


MATHEMATICAL FOUNDATIONS OF

STATISTICAL MECHANICS






BY A. I. KHINCHIN 

TRANSLATED FROM THE RUSSIAN BY G. GAMOW


Dover Publications, Inc.


NEW YORK 







Copyright 1949 by Dover Publications, Inc.
All rights reserved under Pan American and International
Copyright Conventions.



Published in Canada by General Publishing Company,
Ltd., 30 Lesmill Road, Don Mills, Toronto,
Ontario.

Published in the United Kingdom by Constable
and Company, Ltd., 10 Orange Street, London WC 2.





International Standard Book Number: 0-486-60147-1 
Library of Congress Catalog Card Number: 49-9707


Manufactured in the United States of America 

Dover Publications, Inc. 

180 Varick Street 
New York, N. Y. 10014 




CONTENTS 


Preface

Chapter I. Introduction

1. A brief historical sketch
2. Methodological characterization

Chapter II. Geometry and Kinematics of the Phase Space

3. The phase space of a mechanical system
4. Theorem of Liouville
5. Theorem of Birkhoff
6. Case of metric indecomposability
7. Structure functions
8. Components of mechanical systems

Chapter III. Ergodic Problem

9. Interpretation of physical quantities in statistical mechanics
10. Fixed and free integrals
11. Brief historical sketch
12. On metric indecomposability of reduced manifolds
13. The possibility of a formulation without the use of metric indecomposability

Chapter IV. Reduction to the Problem of the Theory of Probability

14. Fundamental distribution law
15. The distribution law of a component and its energy
16. Generating functions
17. Conjugate distribution functions
18. Systems consisting of a large number of components

Chapter V. Application of the Central Limit Theorem

19. Approximate expressions of structure functions
20. The small component and its energy. Boltzmann's law
21. Mean values of the sum functions
22. Energy distribution law of the large component
23. Example of monatomic ideal gas
24. The theorem of equipartition of energy
25. A system in thermal equilibrium. Canonical distribution of Gibbs

Chapter VI. Ideal Monatomic Gas

26. Velocity distribution. Maxwell's law
27. The gas pressure
28. Physical interpretation of the parameter
29. Gas pressure in an arbitrary field of force

Chapter VII. The Foundation of Thermodynamics

30. External parameters and the mean values of external forces
31. The volume of the gas as an external parameter
32. The second law of thermodynamics
33. The properties of entropy
34. Other thermodynamical functions

Chapter VIII. Dispersion and the Distributions of Sum Functions

35. The intermolecular correlation
36. Dispersion and distribution laws of the sum functions

Appendix

The proof of the central limit theorem of the theory of probability

Notations

Index



PREFACE 


Statistical mechanics presents two fundamental problems for
mathematics: (1) the so-called ergodic problem, that is the 
problem of a rigorous justification of replacement of time- 
averages by space (phase)-averages; (2) the problem of the 
creation of an analytic apparatus for the construction of 
asymptotic formulas. In order to become familiar with these 
two groups of problems, a mathematician usually has to overcome
several difficulties. For understandable reasons, the books
on physics do not pay much attention to the logical foundation
of statistical mechanics, and a great majority of them are
entirely unsatisfactory from a mathematical standpoint, not 
only because of a non-rigorous mathematical discussion (here a 
mathematician would usually be able to put things in order by 
himself), but mainly because of the almost complete absence of 
a precise formulation of the mathematical problems which occur 
in statistical mechanics. 

In the books on physics the formulation of the fundamental 
notions of the theory of probability as a rule is several decades 
behind the present scientific level, and the analytic apparatus of 
the theory of probability, mainly its limit theorems, which 
could be used to establish rigorously the formulas of statistical 
mechanics without any complicated special machinery, is completely
ignored.

The present book considers as its main task to make the 
reader familiar with the mathematical treatment of statistical 
mechanics on the basis of modern concepts of the theory of
probability and a maximum utilization of its analytic apparatus. 
The book is written, above all, for the mathematician, and its 
purpose is to introduce him to the problems of statistical 
mechanics in an atmosphere of logical precision, outside of 
which he cannot assimilate and work, and which, unfortunately,
is lacking in the existing physical expositions. 

The only essentially new material in this book consists in the 
systematic use of limit theorems of the theory of probability for
rigorous proofs of asymptotic formulas without any special
analytic apparatus. The few existing expositions which intended
to give a rigorous proof to these formulas, were forced to use for
this purpose special, rather cumbersome, mathematical machinery.
We hope, however, that our exposition of several
other questions (the ergodic problem, properties of entropy, 
intramolecular correlation, etc.) can claim to be new to a certain 
extent, at least in some of its parts. 



CHAPTER I 


INTRODUCTION 

1. A brief historical sketch. After the molecular theory of 
the structure of matter attained a predominant role in physics, 
the appearance of new statistical (or probabilistic) methods of 
investigation in physical theories became unavoidable. From 
this new point of view each portion of matter (solid, liquid, or 
gaseous) was considered as a collection of a large number of 
very small particles. Very little was known about the nature 
of these particles except that their number was extremely large, 
that in a homogeneous material these particles had the same 
properties, and that these particles were in a certain kind of 
interaction. The dimensions and structure of the particles, as 
well as the laws of the interaction could be determined only 
hypothetically. 

Under such conditions the usual mathematical methods of 
investigation of physical theories naturally remained completely 
powerless. For instance, it was impossible to expect to master 
such problems by means of the apparatus of differential equations.
Even if the structure of the particles and the laws of their
interaction were known, their exceedingly large number would 
have presented an insurmountable obstacle to the study of 
their motions by such methods of differential equations as are 
used in mechanics. Other methods had to be introduced, for 
which the large number of interacting particles, instead of being 
an obstacle, would become a stimulus for a systematic study of 
physical bodies consisting of these particles. On the other hand, 
the new methods should be such that a lack of information 
concerning the nature of the particles, their structure, and the 
character of their interaction, would not restrict the efficiency of 
these methods. 

All these requirements are satisfied best by the methods of 
the theory of probability. This science has for its main task the 
study of group phenomena, that is, such phenomena as occur in
collections of a large number of objects of essentially the same
kind. The main purpose of this investigation is the discovery
of such general laws as are implied by the gross character of the
phenomena and depend comparatively little on the nature of
the individual objects. It is clear that the well-known trends of
the theory of probability fit in the best possible way the aforementioned
special demands of the molecular-physical theories.
Thus, as a matter of principle, there was no doubt that statistical
methods should become the most important mathematical
tool in the construction of new physical theories; if there existed 
any disagreement at all, it concerned only the form and the 
domain of application of these methods. 

In the first investigations (Maxwell, Boltzmann) these applications
of statistical methods were not of a systematical character.
Fairly vague and somewhat timid probabilistic arguments
do not pretend here to be the fundamental basis, and play approximately
the same role as purely mechanical considerations.
Two features are characteristic of this primary period. First, far 
reaching hypotheses are made concerning the structure and the 
laws of interaction between the particles; usually the particles 
are represented as elastic spheres, the laws of collision of which 
are used in an essential way for the construction of the theory. 
Secondly, the notions of the theory of probability do not appear 
in a precise form and are not free from a certain amount of 
confusion which often discredits the mathematical arguments 
by making them either void of any content or even definitely 
incorrect. The limit theorems of the theory of probability do 
not find any application as yet. The mathematical level of all 
these investigations is quite low, and the most important 
mathematical problems which are encountered in this new
domain of application do not yet appear in a precise form.¹

¹An excellent critical analysis of this first period is found in a well-known
work by P. and T. Ehrenfest which appeared in vol. IV of the Encyclopaedie
der Mathematischen Wissenschaften and which played a considerable
role in the development of the mathematical foundations of statistical
mechanics.

It should be observed, however, that the tendency to restrict
the role of statistical methods by introducing purely mechanical
considerations (from various hypotheses concerning the laws
of interaction of particles) is not restricted to the past. This
tendency is clearly present in many modern investigations.
According to a historically accepted terminology, such investigations
are considered to belong to the kinetic theory of matter,
as distinct from the statistical mechanics which tries to reduce 
such hypotheses to a minimum by using statistical methods as 
much as possible. Each of these two tendencies has its own 
advantages. For instance, the kinetic theory is indispensable 
when dealing with problems concerning the motion of separate 
particles (number of collisions, problems concerning the study 
of systems of special kinds, mono-atomic ideal gas); the methods 
of the kinetic theory are also often preferable, because they give 
a treatment of the phenomena which is simpler mathematically 
and more detailed. But in questions concerning the theoretical 
foundation of general laws valid for a great variety of systems, 
the kinetic theory naturally becomes sometimes powerless and 
has to be replaced by a theory which makes as few special 
hypotheses as possible concerning the nature of the particles. 
In particular, it was precisely the necessity of a statistical 
foundation for the general laws of thermodynamics that produced
trends which found their expression in the construction
of statistical mechanics. To avoid making any special hypotheses
about the nature of the particles it became necessary in establishing
a statistical foundation to develop laws which had to be
valid no matter what was the nature of these particles (within 
quite wide limitations). 

The first systematic exposition of the foundations of statistical 
mechanics, with fairly far developed applications to thermodynamics
and some other physical theories, was given in Gibbs'
well-known book.² Besides the above mentioned tendency not
to make any hypotheses about the nature of particles the following
are characteristic of the exposition of Gibbs.


²Gibbs, “Elementary principles in statistical mechanics,” Yale
University Press, 1902.





(1) A precise introduction of the notion of probability, which 
is given here a purely mechanical definition, is lacking 
with the resulting questionable logical precision of all 
arguments of statistical character. 

(2) The limit theorems of the theory of probability do not
find any application (at that time they were not quite
developed in the theory of probability itself).

(3) The author considers his task not as one of establishing 
physical theories directly, but as one of constructing 
statistic-mechanical models which have some analogies 
in thermodynamics and some other parts of physics; 
hence he does not hesitate to introduce some very special 
hypotheses of a statistical character (canonical distribution,
ch. V, § 25) without attempting to prove them or
even to interpret their meaning and significance. 

(4) The mathematical level of the book is not high; although 
the arguments are clear from the logical standpoint, 
they do not pretend to any analytical rigor.

At the time of publication of Gibbs’ book, the fundamental 
problems raised in mathematical science in connection with the 
foundation of statistical mechanics became more or less clear. 
If we disregard some isolated small problems, we have here two 
fundamental groups of problems representing a broad, deep, 
interesting and difficult field of research in mathematics which 
is far from being exhausted even at present. The first of these 
groups is centered around the so-called ergodic problem (ch.
III), that is, the problem of the logical foundation for the interpretation
of physical quantities by averages of their corresponding
functions, averages taken over the phase-space or a
suitably selected part of it. This problem, originated by Boltzmann,
apparently is far from its complete solution even at the
present time. This group of problems was neglected by the 
investigators for a long time after some unsuccessful attempts, 
based either on some inappropriate hypotheses introduced ad 
hoc, or on erroneous logical and mathematical arguments 
(which, unfortunately, have been repeated without any criticism
in later handbooks). In the book of Gibbs these problems
naturally are not considered because of the tendency to construct
models. Only recently (1931), the remarkable work of
G. D. Birkhoff again attracted the attention of many investigators
to these problems, and since then this group of problems
has never ceased to interest mathematicians, who devote more 
and more effort to it every year. We will discuss this group of 
problems in more detail in ch. III.

The second group of problems is connected with the methods 
of computation of the phase-averages. In the majority of cases, 
these averages cannot be calculated precisely. The formulas 
which are derived for them in the general theory (that is, without
specification of the mechanical system under discussion) are
complicated, not easy to survey, and as a rule, not suited for 
mathematical treatment. It is quite natural, therefore, to try 
to find simpler and more convenient approximations for these 
averages. This problem is always formulated as a problem of 
deriving asymptotic formulas which approach the precise 
formulas when the number of particles constituting the given 
system increases beyond any limit. These asymptotic formulas
have been found long ago by a semi-heuristic method (by 
means of an unproved extrapolation, starting from some of the 
simplest examples) and were without rigorous mathematical 
justification until fairly recent years. A decided change in this 
direction was brought about by the papers of Darwin and 
Fowler about twenty years ago. Strictly speaking these authors 
were the first to give a systematic computation of the average 
values; up to that time, such a computation was in most cases 
replaced by a more or less convincing determination of “most
probable” values which (without rigorous justification) were
assumed to be approximately equal to the corresponding average
values. Darwin and Fowler also created a simple, convenient,
and mathematically rigorous apparatus for the computation
of asymptotic formulas. The only defect of their theory lies
in an extreme abstruseness of the justification of their mathematical
method. To a considerable extent this abstruseness was
due to the fact that the authors did not use the limit theorems
of the theory of probability (sufficiently developed by that time),
but created anew the necessary analytical apparatus. In any
case, the course in statistical mechanics published by Fowler³
on the basis of this method, represents up to now the only book
on the subject, which is on a satisfactory mathematical level.⁴

In closing this brief sketch we should mention that the 
development of atomic mechanics during the last decades has 
changed the face of physical statistics to such a degree that, 
naturally, statistical mechanics had to extend its mathematical 
apparatus in order to include also quantum phenomena. Moreover,
from the modern point of view, we should consider
quantized systems as a general type of which the classical
systems are a limiting case. Fowler’s course is arranged according
to precisely this point of view: the new method of constructing
asymptotic formulas for phase-averages is established and
developed for the quantized systems, and the formulas which 
correspond to the classical systems are obtained from these by 
a limiting process. 

Quantum statistics also presents some new mathematical 
problems. Thus, the justification of the peculiar principles of 
statistical calculations which are the basis of the statistics of 
Bose-Einstein and Fermi-Dirac required mathematical arguments
which were distinct as a matter of principle (not only by
their mathematical apparatus) from all those dealt with in the 
classical statistical mechanics. Nevertheless, it could be stated 
that the transition from the classical systems to the quantum 
systems did not introduce any essentially new mathematical 
difficulties. Any method of justification of the statistical mechanics
of the classical systems, would require for quantized
systems an extension of the analytical apparatus only, in some
cases introducing small difficulties of a technical character but
not presenting new mathematical problems. In places where we
might have to use finite sums or series, we operate with integrals;
continuous distributions of probability might be replaced by
the discrete ones, for which completely analogous limit theorems
hold true.

³Fowler, “Statistical mechanics,” Cambridge, 1929.

⁴Except, however, the well known course in the theory of probabilities
by v. Mises. However, the main viewpoint of v. Mises differs from the
traditional standpoint to such an extent that the theory expounded by him
hardly could be given the historically established name of statistical
mechanics; mechanical concepts are almost completely eliminated from this
theory. In any case, we shall have no occasion to compare the exposition
of v. Mises with other expositions.

Precisely for these reasons in the present book we have restricted
ourselves to the discussion of the classical systems,
leaving completely out of consideration everything concerning
quantum physics, although all the methods which we develop
after suitable modifications could be applied without any difficulties
to the quantum systems. We have chosen the classical
systems mainly because our book is designed, in the first place,
for a mathematical reader, who cannot always be assumed to
have a sufficient knowledge of the foundations of quantum
mechanics. On the other hand, we did not consider as expedient
the inclusion in the book of a brief exposition of these foundations.
Such an inclusion would have considerably increased the
size of the book, and would not attain the desired purpose since
quantum mechanics with its novel ideas, often contradicting
the classical representations, could not be substantially assimilated
by studying such a brief exposition.

2. Methodological characterization. Statistical mechanics 
has for its purpose the construction of a special physical theory 
which should represent a theoretical basis for some parts of 
physics (in the first place, for thermodynamics) using as few 
special hypotheses as possible. More precisely, statistical mechanics
considers every kind of matter as a certain mechanical
system and tries to deduce the general physical (in particular, 
thermodynamical) laws governing the behavior of this matter 
from the most general properties of mechanical systems, and eo 
ipso to eliminate from the corresponding parts of physics any 
theoretically unjustified postulation of their fundamental laws. 
The basic assumptions of statistical mechanics should be then 
(1) any general laws which hold for all (or at least for very
general classes of) mechanical systems, and (2) representations
of any kind of matter as a mechanical system consisting of a
very large number of components (particles). Thus the purpose
of statistical mechanics consists in deriving special properties
of such many-molecular systems from the general laws of
mechanics and in showing that, with a suitable physical interpretation
of the most important quantities appearing in the
theory, these derived special properties will give precisely those 
fundamental physical (and in particular, thermodynamical) 
laws governing matter in general and certain special kinds of 
matter in particular. The mathematical method which allows 
us to realize these aims, for the reasons explained in §1, is the 
method of the theory of probabilities. 

Let us make some further remarks concerning the above 
described purpose of statistical mechanics. 

1. The fact that statistical mechanics considers every kind of 
matter as a mechanical system and tries to derive all its properties
from the general laws of mechanics, often leads to a criticism
of being a priori mechanistic. In fact, however, all reproaches
of such kind are based on a misunderstanding. Those
general laws of mechanics which are used in statistical mechanics 
are necessary for any motions of material particles, no matter 
what are the forces causing such motions. It is a complete 
abstraction from the nature of these forces, that gives to statistical
mechanics its specific features and contributes to its
deductions all the necessary flexibility. This is best illustrated 
by the obvious fact that if we modify our point of view on the 
nature of the particles of a certain kind of matter and on the 
character of their interaction, the properties of this kind of 
matter established by methods of statistical mechanics remain 
unchanged by these modifications because no special assumption
was made in the process of deduction of these properties.

The circumstance of being governed by the general laws of 
mechanics does not lend any specific features to the systems 
studied in statistical mechanics; as it has been said already,
these laws govern any motion of matter, whether it has any
relation to statistical mechanics, or not. The specific character
of the systems studied in statistical mechanics consists mainly
in the enormous number of degrees of freedom which these
systems possess. Methodologically, this means that the standpoint
of statistical mechanics is determined not by the mechanical
nature, but by the particle structure of matter. It almost
seems as if the purpose of statistical mechanics is to observe 
how far reaching are the deductions made on the basis of the 
atomic structure of matter, irrespective of the nature of these 
atoms and the laws of their interaction. 

2. Since the mechanical basis of statistical mechanics is 
restricted only by those general laws which hold for any 
systems (or at least for very general classes of systems), of 
considerable interest for us (even before the assumption of a 
large number of components) are the results of the so-called 
general dynamics, a branch of mechanics whose purpose is the 
deduction of such laws which hold for all mechanical systems
and can be derived from the general laws of mechanics alone.
This theory, evidently of a considerable philosophical interest,
is of comparatively recent origin. In the past it was usually
assumed that the deductions which could be made from the 
general laws of mechanics were not sufficiently concrete to have 
any scientific interest. It developed later that the situation was 
different, and at present the constructions of general dynamics 
are attracting interest of more and more investigators. In particular,
all the above mentioned investigations of Birkhoff and
of the increasing number of his disciples belong to this theory. 
It is particularly interesting to us that the methods (and 
partially, problems) of general dynamics, even before any 
assumptions are made concerning the number of degrees of 
freedom of a system under investigation, show a definitely 
expressed statistical tendency. This fact is well-known to 
anyone who has studied investigations in this field with any 
amount of attention. Thus the fundamental theorem of Birkhoff 
is formally equivalent to a certain theorem of the theory of
probability; conversely, the theory of stationary stochastic
processes, which represents one of the most interesting chapters
of the modern theory of probability, formally coincides with one
of the parts of the general dynamics.

The reason for this can be easily recognized. The most important
problem of general dynamics is the investigation of the
dependence of the character of the motion of an arbitrary
mechanical system on the initial data, or more precisely the
determination of such characteristics of the motion which in
one sense or another “almost do not depend” on these initial
data. Such a quantity for a great majority of trajectories
assumes values very near to a certain constant number. But the
expression “for a great majority of trajectories” has the meaning
that the set of trajectories which do not satisfy this requirement
is metrically negligible in some metric, that is, has for its
measure either zero or a very small positive number.
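
A quantity that "almost does not depend" on the initial data can be illustrated numerically. The sketch below is only a toy illustration under stated assumptions and is not taken from the text: instead of a mechanical system it uses a discrete-time rotation of the circle [0, 1) by an irrational step (the names ALPHA, observable and time_average are hypothetical), and it compares the time average of one function along trajectories started from several initial points with the space average of that function, which is 1/2.

```python
import math

# Assumed toy dynamics (not from the text): x -> x + ALPHA (mod 1)
# with an irrational step ALPHA.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def observable(x):
    """A sample function on the circle [0, 1); its space average is 1/2."""
    return math.cos(2.0 * math.pi * x) ** 2


def time_average(x0, n_steps):
    """Average of the observable along the trajectory x0, x0 + ALPHA, ..."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        total += observable(x)
        x = (x + ALPHA) % 1.0
    return total / n_steps


# Time averages for several initial points; each should be close to 1/2.
averages = [time_average(x0, 10_000) for x0 in (0.0, 0.123, 0.5, 0.9)]
print(averages)
```

In this particular toy case the agreement holds for every initial point; the propositions discussed in the text are weaker in general, since they allow an exceptional set of initial data of zero or very small measure.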

In this sense many propositions of general dynamics are of a 
peculiarly typical form. They state that for most general classes 
of mechanical systems the motion is subjected to certain definite 
conditions, if not for all initial data then at least for a metrically 
great majority of them. It is known, however, that propositions 
which can be formulated in such form, in most cases turn out 
to be equivalent to some theorems of the theory of probability. 
This theory from a formal point of view could be considered as a 
group of special problems of the theory of measure, namely 
such problems as most often deal with the establishment of a 
metrically negligible smallness of certain sets. It suffices to 
remember that the majority of propositions of the theory of 
functions of a real variable concerned with the notions of convergence
“in measure”, “almost everywhere” etc., finds an
adequate expression in the terminology of the theory of 
probability. Thus it can be stated that even general dynamics 
which represents the mechanical basis of statistical mechanics, 
is a science which is filled to a great extent with the ideas of the 
theory of probability and which successfully uses its methods 
and analogies. 

As to the statistical mechanics, it is a science whose probabilistic
character is noticeable in two entirely distinct and completely
independent features: in the general dynamics as its
mechanical basis, and in the postulate of a great number of 
degrees of freedom allowing a most fruitful application of 
methods of the theory of probability. 

3. It remains to discuss the form in which methods and results 
of the theory of probability could be utilized in determining 
asymptotic formulas which express approximately the phase 
averages of various functions in the case of a large number of 
degrees of freedom (or for systems consisting of a large number 
of particles). 

As previously mentioned, in most expositions these formulas 
are introduced without any justification. After having derived 
these formulas for some especially simple particular case (for 
instance, for a homogeneous mono-atomic ideal gas) the authors 
usually extend them to the general case either without any 
justification, or using some arguments of heuristical character. 
Perhaps a single exception from this general rule is represented 
by the method of Fowler. Darwin and Fowler, as was already 
mentioned, develop a special and very abstruse analytical 
apparatus for a mathematical justification of the method of 
obtaining asymptotic formulas, which they have created. Nowhere
do they use explicit results of the theory of probability;
instead, they build a separate logical structure, but, as a matter
of fact, they are merely moving along an analytical path parallel
to that which is used by the theory of probability in deriving
its limit theorems. From here only one step remains in attempting
to introduce a method which we consider as the most expedient:
instead of repeating in a complicated formulation the whole
long analytical process which leads to limit theorems of the 
theory of probability, we attempt to find immediately the 
bridge which unifies these two groups of problems, and the 
transition formula which would reduce the entire asymptotic 
problem of the statistical mechanics to the known limit theorem
of the theory of probabilities. This is the path we will take in 
the present book. In this manner we will be able to achieve 
simultaneously two ends: from the methodological point of 
view we will make clear the role of the theory of probabilities in
the statistical mechanics; from the formal point of view we have
the possibility of establishing the propositions of statistical
mechanics on the basis of the mathematically exact laws of the
theory of probabilities. In order to emphasize the two above
mentioned points we will give in the subsequent text the formulation
of the necessary limit theorems of the theory of probabilities
without giving their proofs (the latter will be given in the
appendix). We hope that such a method of presentation will 
be attractive to many of those readers who are frightened by 
the complicated formalistics of the Darwin-Fowler method. 



CHAPTER II 


GEOMETRY AND KINEMATICS OF THE 

PHASE SPACE 


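3. The phase space of a mechanical system. In the statistical
mechanics it is convenient to describe the state of a
mechanical system G with s degrees of freedom, by values of
the Hamiltonian variables $q_1, q_2, \ldots, q_s;\ p_1, p_2, \ldots, p_s$.

The equations of motion of the system then assume the
“canonical” form

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \qquad (1 \le i \le s), \tag{1}$$

where H is the so-called Hamiltonian function of the 2s variables
$q_1, \ldots, p_s$ (we always shall assume it not to depend on
time explicitly). The function $H(q_1, \ldots, p_s)$ is an integral of the
system (1). Indeed, in view of this system of equations,

$$\frac{dH}{dt} = \sum_{i=1}^{s} \left( \frac{\partial H}{\partial q_i}\frac{dq_i}{dt} + \frac{\partial H}{\partial p_i}\frac{dp_i}{dt} \right) = \sum_{i=1}^{s} \left( \frac{\partial H}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial p_i}\frac{\partial H}{\partial q_i} \right) = 0.$$

Since system (1) contains only equations of the first order,
the values of the Hamiltonian variables $q_1, \ldots, p_s$ given for
some time $t = t_0$, determine their values for any other time $t$
(succeeding or preceding $t_0$).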
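
The statement that H is an integral of the canonical equations can be checked numerically on a simple case. The following is a minimal sketch under assumptions that are not in the text: a single degree of freedom with the harmonic-oscillator Hamiltonian H = p²/(2m) + kq²/2, unit values for the hypothetical constants M and K, and a classical Runge-Kutta scheme chosen only for convenience. The value of H at the end of the motion should agree with its initial value up to the small numerical error of the integrator.

```python
# Assumed example system (not from the text): one degree of freedom,
# H(q, p) = p**2 / (2*M) + K * q**2 / 2.
M, K = 1.0, 1.0  # hypothetical mass and spring constant


def hamiltonian(q, p):
    return p * p / (2.0 * M) + K * q * q / 2.0


def rhs(q, p):
    """Right-hand sides of the canonical equations for this H."""
    dq_dt = p / M    # dq/dt =  dH/dp
    dp_dt = -K * q   # dp/dt = -dH/dq
    return dq_dt, dp_dt


def rk4_step(q, p, dt):
    """One classical Runge-Kutta step of the canonical equations."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new


def evolve(q, p, t_total, n_steps):
    """Follow the state (q, p) for a time t_total in n_steps equal steps."""
    dt = t_total / n_steps
    for _ in range(n_steps):
        q, p = rk4_step(q, p, dt)
    return q, p


q0, p0 = 1.0, 0.0
q1, p1 = evolve(q0, p0, t_total=10.0, n_steps=10_000)
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
print(energy_drift)
```

The integrator is not exactly energy-preserving, so a tiny nonzero drift is expected; the conservation law itself is the exact statement derived above.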

Imagine now a Euclidean space Γ of 2s dimensions, whose
points are determined by the Cartesian coordinates $q_1, \ldots, p_s$.
Then to each possible state of our mechanical system G
there will correspond a uniquely determined point of the space
Γ, which we shall call the image point of the given system;
the whole space Γ we agree to call the phase space of this
system. We shall see that, for the purposes of the statistical
mechanics, the geometrical interpretation of the set of all
possible states of the system by means of its phase space, appears
exceedingly fruitful and receives a basic methodological significance.

Since the state of the system at any given time determines 
uniquely its state at any other time, the motion of the image 
point in the phase space which represents the changes of state 
of the given system depending on time is uniquely determined 
by its initial position. The image point describes in the phase 
space a curve which we shall call a trajectory. It follows that 
through each point of the phase space there passes one and 
only one trajectory, and the kinematic law of motion of the 
image point along this trajectory is uniquely determined. 

If at the time $t_0$ the image point of the system G is some
point $M_0$ of the space $\Gamma$, and at the other time t (succeeding
or preceding $t_0$) some other point M, then the points $M_0$ and
M determine each other uniquely. We can say that the point
$M_0$ of the phase space during the interval of time $(t_0, t)$ goes
over into M. During the same interval of time every other
point of the space $\Gamma$ goes over into a definite new position; in
other words, the whole space is transformed into itself in a
one-to-one way, since, conversely, the position of a point at
the time t determines uniquely its position at the time $t_0$.
Furthermore, if we keep $t_0$ fixed and vary t arbitrarily, we see
that all the set of possible changes of state of the given system 
is represented as a continuous sequence (one-parameter group) 
of transformations of its phase space into itself, which sequence 
can be considered as a continuous motion of this space in itself. 
This representation also turns out to be very convenient for the 
purposes of the statistical mechanics. We shall call the above- 
described motion of the phase space in itself its natural motion. 
Since the displacement of any point of the phase space in its 
natural motion during an interval of time At, depends only on 
the initial position of this point and the length of this interval, 
but does not depend on the choice of the initial time, the natural 
motion of the phase space is stationary. This means that the 
velocities of points of the phase space in this motion depend
uniquely on the position of these points, but do not change with
the time.

In what follows, we shall often call the Hamiltonian variables
$q_1, \dots, p_s$ of the given system G the dynamic coordinates of
its image point in the space $\Gamma$, and any function of these
variables a phase function of the given system. The most
important phase function is the Hamiltonian function
$H(q_1, \dots, p_s)$. This function determines completely the
mechanical nature of the given system, because it determines
completely the system of equations of motion. In particular, 
this function determines completely the natural motion of the 
phase space of the given system. 

When convenient we shall denote the set of the dynamic
coordinates of the given mechanical system (the point of the
phase space) by a single letter P, and, correspondingly, an
arbitrary phase function by f(P).

There are cases where the phase space $\Gamma$ has a part $\Gamma'$ with
the property that an arbitrary point of this part remains in it
during all the natural motion of the space $\Gamma$. Such a part $\Gamma'$
participates in the natural motion by transforming into itself,
and therefore it is called an invariant part of the space $\Gamma$. In
what follows we shall see that the motion of an invariant part
plays a very essential role in statistical mechanics.

The special form of the Hamiltonian system (1) has as a
consequence the fact, easy to foresee, that not every continuous
transformation of the phase space into itself can appear as its
natural motion. A natural motion is characterized by some
special properties, and the most important of these properties
can be formulated in two theorems on which, to a considerable
extent, the whole construction of statistical mechanics is
based. We shall pass now to a proof of these theorems.

4. Theorem of Liouville. The first of these two theorems
(under slightly more restricted assumptions) was proved by the
French mathematician Liouville in the middle of the past
century.

Let M be any measurable (in the sense of Lebesgue) set of
points of the phase space $\Gamma$ of the given mechanical system. In
the natural motion of this space the set M goes over into
another set $M_t$ during an interval of time t. The theorem of
Liouville asserts that the measure of the set $M_t$ for any t
coincides with the measure of the set M. In other words, the
measure of measurable point-sets is an invariant of the natural
motion of the space $\Gamma$.

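For the proof of this theorem it will be convenient to
introduce a more uniform notation for the dynamic coordinates of
points of the space $\Gamma$. Let

$$x_i = q_i, \qquad x_{s+i} = p_i \qquad (i = 1, 2, \dots, s)$$

and

$$X_i = \frac{\partial H}{\partial x_{s+i}}, \qquad
X_{s+i} = -\frac{\partial H}{\partial x_i} \qquad (i = 1, 2, \dots, s).$$

In this notation the canonical system (1) of §3 can be
written in the form

$$\frac{dx_i}{dt} = X_i(x_1, x_2, \dots, x_{2s}) \qquad (1 \le i \le 2s). \tag{2}$$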


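For what follows let us observe that

$$\sum_{i=1}^{2s} \frac{\partial X_i}{\partial x_i}
= \sum_{i=1}^{s} \frac{\partial^2 H}{\partial q_i\,\partial p_i}
- \sum_{i=1}^{s} \frac{\partial^2 H}{\partial p_i\,\partial q_i} = 0. \tag{3}$$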


If $y_i$ $(i = 1, 2, \dots, 2s)$ are the values of the variables $x_i$ at
some definite time $t_0$, we obtain as a uniquely determined
solution of the system (2) the system of functions

$$x_i = f_i(t;\ y_1, \dots, y_{2s}) \qquad (1 \le i \le 2s).$$

Let us agree to denote the measure of the set A by $\mathfrak{M}A$. Then*

$$\mathfrak{M}M_t = \int_{M_t} dx_1 \cdots dx_{2s}.$$

In this integral let us change the variables by setting

$$x_i = f_i(t;\ y_1, \dots, y_{2s}),$$

where t is considered as an auxiliary parameter. Since the point
$(y_1, \dots, y_{2s})$ of the space $\Gamma$ obviously describes the set M
when the point $(x_1, \dots, x_{2s})$ describes the set $M_t$,

$$\mathfrak{M}M_t = \int_{M} J(t;\ y_1, \dots, y_{2s})\, dy_1 \cdots dy_{2s},$$

where

$$J(t;\ y_1, \dots, y_{2s})
= \frac{\partial(x_1, \dots, x_{2s})}{\partial(y_1, \dots, y_{2s})}.$$

If we differentiate this with respect to t we find

$$\frac{d\,\mathfrak{M}M_t}{dt}
= \int_{M} \frac{\partial J}{\partial t}\, dy_1 \cdots dy_{2s}. \tag{4}$$


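We can compute $\partial J/\partial t$ by the rule of differentiation of
determinants. We find

$$\frac{\partial J}{\partial t} = \sum_{i=1}^{2s} J_i, \tag{5}$$

where

$$J_i = \frac{\partial(x_1, \dots, x_{i-1},\ \partial x_i/\partial t,\ x_{i+1}, \dots, x_{2s})}
{\partial(y_1, \dots, y_{2s})} \qquad (1 \le i \le 2s).$$

In view of the system (2), since $\partial x_i/\partial t$ coincides with $dx_i/dt$,

$$J_i = \frac{\partial(x_1, \dots, x_{i-1},\ X_i,\ x_{i+1}, \dots, x_{2s})}
{\partial(y_1, \dots, y_{2s})} \qquad (1 \le i \le 2s).$$

But

$$\frac{\partial X_i}{\partial y_k}
= \sum_{r=1}^{2s} \frac{\partial X_i}{\partial x_r}\,\frac{\partial x_r}{\partial y_k}
\qquad (1 \le i \le 2s,\ 1 \le k \le 2s),$$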


*In this book we shall, in most cases, denote multiple integrals using only
one integral sign; the dimensionality of the integral will be determined
either by the number of differentials under the integral sign, or by some
other obvious considerations. In cases where the domain of integration is
not indicated explicitly, the integration will be taken over the whole space.





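hence the previous equality gives

$$J_i = \sum_{r=1}^{2s} \frac{\partial X_i}{\partial x_r}\,
\frac{\partial(x_1, \dots, x_{i-1},\ x_r,\ x_{i+1}, \dots, x_{2s})}
{\partial(y_1, \dots, y_{2s})} \qquad (1 \le i \le 2s).$$

But clearly

$$\frac{\partial(x_1, \dots, x_{i-1},\ x_r,\ x_{i+1}, \dots, x_{2s})}
{\partial(y_1, \dots, y_{2s})}
= \begin{cases} 0 & \text{if } r \ne i, \\ J & \text{if } r = i, \end{cases}$$

whence

$$J_i = J\,\frac{\partial X_i}{\partial x_i} \qquad (1 \le i \le 2s).$$

On substituting into (5) and using (3) we have

$$\frac{\partial J}{\partial t} = J \sum_{i=1}^{2s} \frac{\partial X_i}{\partial x_i} = 0.$$

Then (4) shows

$$\frac{d\,\mathfrak{M}M_t}{dt} = 0,$$

which proves the invariance of the measure in the natural
motion of the phase space.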
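The invariance of measure is equivalent to the statement that the Jacobian J of the natural motion stays equal to 1. The following sketch checks this numerically for one degree of freedom. It is an illustration only: the pendulum Hamiltonian H = p^2/2 - cos q, the Runge-Kutta integrator, the finite-difference step `eps` and all function names are assumptions made here, not part of the text.

```python
import math

def rhs(q, p):
    # Canonical equations for the illustrative Hamiltonian H = p^2/2 - cos q.
    return p, -math.sin(q)

def flow(q, p, t, steps=2000):
    # RK4 approximation of the point P_t into which P = (q, p) goes over.
    dt = t / steps
    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p

def jacobian_determinant(q, p, t, eps=1e-5):
    # Central finite differences of the map (q, p) -> P_t, then a 2 x 2 determinant.
    q_a, p_a = flow(q + eps, p, t)
    q_b, p_b = flow(q - eps, p, t)
    q_c, p_c = flow(q, p + eps, t)
    q_d, p_d = flow(q, p - eps, t)
    dq_dq = (q_a - q_b) / (2.0 * eps)
    dp_dq = (p_a - p_b) / (2.0 * eps)
    dq_dp = (q_c - q_d) / (2.0 * eps)
    dp_dp = (p_c - p_d) / (2.0 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

if __name__ == "__main__":
    print(jacobian_determinant(1.0, 0.5, 3.0))
```

The printed value differs from 1 only by discretization and rounding errors, in agreement with $\partial J/\partial t = 0$ and J = 1 at t = 0.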

Corollary. In the natural motion of the phase space every
point P, during an interval of time t, goes over into a uniquely
determined other point which we always shall denote by $P_t$.
If f(P) is an arbitrary phase function, we shall write

$$f(P_t) = f(P, t);$$

here t might be also negative. Now, let M be a Lebesgue
measurable set of points of the space $\Gamma$, of finite measure, and
f(P) a phase function, Lebesgue integrable over $\Gamma$. By the
Liouville theorem the volume element dV of the space $\Gamma$ during
the time t goes over into an equal volume element $dV_t$. Let
us consider the integral

$$\int_{M_t} f(P)\, dV_t$$

and let us change the variables by introducing as new variables
the dynamic coordinates of the point which goes over into P
during the time t. It is clear that: (1) the new domain of
integration will be the set M; (2) under the sign of integral the
symbolic argument P should be replaced by $P_t$; (3) the element
$dV_t$ should be replaced by the equal element dV. Thus we get

$$\int_{M_t} f(P)\, dV_t = \int_{M} f(P_t)\, dV = \int_{M} f(P, t)\, dV.$$

In the left-hand member we also can write dV instead of
$dV_t$, so that

$$\int_{M_t} f(P)\, dV = \int_{M} f(P, t)\, dV. \tag{6}$$

In particular, if the set M is invariant, then, for each t,

$$\int_{M} f(P)\, dV = \int_{M} f(P, t)\, dV. \tag{7}$$


5. Theorem of Birkhoff. The second theorem, to the proof 
of which we have to turn now, was proved comparatively 
recently (in 1931) by G. D. Birkhoff (the form of the proof 
which we are giving here is due to A. N. Kolmogoroff). 

Let V be an invariant part of the phase space, of finite
volume, and f(P) a phase function summable over V and
determined at all points* $P \in V$.

The theorem of Birkhoff asserts that the limit

$$\lim_{C \to \infty} \frac{1}{C} \int_0^C f(P, t)\, dt$$

exists for all points P of the set V, except at most for a certain
set of measure zero (or, more concisely, almost everywhere
on V). The limit also exists almost everywhere when $C \to -\infty$.

*The notation $a \in A$ means that a is an element of the set A.
It is clear that we can interpret the quantity
$(1/C)\int_0^C f(P, t)\, dt$
as the average of the function f(P) along the trajectory passing
through P during the interval of time (0, C). The limit of this
expression as $C \to \infty$ we shall call the time average of the
function f(P) along the trajectory passing through P. The
theorem of Birkhoff asserts that, for a summable function, the
time averages exist along the trajectories passing through
almost all points of V.
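The convergence of the finite-time averages can be watched on a simple example. The sketch below is illustrative and rests on assumptions made here: the harmonic oscillator H = (p^2 + q^2)/2, whose natural motion is a rotation of the (q, p) plane, the phase function f = q^2, and a midpoint quadrature rule; the function names are hypothetical.

```python
import math

def flow(q0, p0, t):
    # Exact natural motion for the illustrative Hamiltonian H = (p^2 + q^2) / 2:
    # a rigid rotation of the (q, p) plane.
    c, s = math.cos(t), math.sin(t)
    return q0 * c + p0 * s, p0 * c - q0 * s

def time_average(f, q0, p0, C, n=200000):
    # Midpoint-rule approximation of (1/C) * integral from 0 to C of f(P, t) dt.
    dt = C / n
    total = 0.0
    for k in range(n):
        q, p = flow(q0, p0, (k + 0.5) * dt)
        total += f(q, p)
    return total * dt / C

def q_squared(q, p):
    return q * q

if __name__ == "__main__":
    for C in (10.0, 100.0, 1000.0):
        print(C, time_average(q_squared, 1.0, 0.0, C))
```

For the initial point (1, 0) the trajectory is q = cos t, so the averages approach 1/2 as C grows, with an error of order 1/C.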

We now pass to the proof. In what follows we set for any
integer n

$$\int_n^{n+1} f(P, t)\, dt = x_n(P), \qquad
\int_n^{n+1} |f(P, t)|\, dt = y_n(P).$$

Lemma 1. Almost everywhere on V, for $n \to \infty$,

$$\frac{y_n(P)}{n} \to 0.$$

The proof of this lemma is based on the Liouville theorem.
If we introduce a new variable $\alpha$ in the integral defining $y_n$,
determined by $t = n + \alpha$, we get

$$y_n(P) = \int_0^1 |f(P, n + \alpha)|\, d\alpha
= \int_0^1 |f(P_n, \alpha)|\, d\alpha = y_0(P_n), \tag{8}$$

since obviously $f(P, n + \alpha) = f(P_n, \alpha)$.

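Let us denote by $S_{n,n}$ and $S_{n,0}$ the sets of points P belonging
to V and satisfying respectively the conditions

$$y_n(P) > \varepsilon n \quad \text{and} \quad y_0(P) > \varepsilon n,$$

where $\varepsilon$ is an arbitrary fixed positive number. It is easy to see
that in the natural motion of the space $\Gamma$ the set $S_{n,n}$ during
the time n goes over into $S_{n,0}$. Indeed, in view of (8) the
inequality $y_n(P) > \varepsilon n$ is equivalent to $y_0(P_n) > \varepsilon n$. Hence
$P \in S_{n,n}$ implies $P_n \in S_{n,0}$, and conversely. Therefore, by
Liouville's theorem,

$$\mathfrak{M}S_{n,n} = \mathfrak{M}S_{n,0}. \tag{9}$$

Now we show that the series

$$\sum_{n=1}^{\infty} \mathfrak{M}S_{n,n}$$

converges. In view of (9) this series can be written as

$$\sum_{n=1}^{\infty} \mathfrak{M}S_{n,0}.$$

If we denote by $F_m$ the set of points of V for which
$m\varepsilon < y_0(P) \le (m + 1)\varepsilon$ and observe that
$S_{n,0} = \sum_{m=n}^{\infty} F_m$, we can
write this series in the form

$$\sum_{n=1}^{\infty} \sum_{m=n}^{\infty} \mathfrak{M}F_m
= \sum_{m=1}^{\infty} \sum_{n=1}^{m} \mathfrak{M}F_m
= \sum_{m=1}^{\infty} m\, \mathfrak{M}F_m
= \frac{1}{\varepsilon} \sum_{m=1}^{\infty} m\varepsilon \int_{F_m} dV$$

$$\le \frac{1}{\varepsilon} \sum_{m=1}^{\infty} \int_{F_m} y_0(P)\, dV
\le \frac{1}{\varepsilon} \int_V y_0(P)\, dV
= \frac{1}{\varepsilon} \int_V dV \int_0^1 |f(P, \alpha)|\, d\alpha,$$

where dV is the element of volume of the space $\Gamma$.

Since V is an invariant part of the space $\Gamma$, the last
expression, in view of (7), is

$$\frac{1}{\varepsilon} \int_0^1 d\alpha \int_V |f(P, \alpha)|\, dV
= \frac{1}{\varepsilon} \int_0^1 d\alpha \int_V |f(P)|\, dV
= \frac{1}{\varepsilon} \int_V |f(P)|\, dV.$$

Since f(P) by assumption is summable over V, this is a finite
number, which proves our assertion.

By a known theorem of the metric theory of sets it follows
that every point of the set V, except at most a set of measure
zero, belongs to no more than a finite number of sets of the
sequence $S_{n,n}$ $(n = 1, 2, \dots)$. In other words, for almost all
points $P \in V$ there exists a number $N = N(P)$ such that
for each $n > N$,

$$y_n(P) \le \varepsilon n.$$

Since $\varepsilon$ is arbitrary, Lemma 1 is proved.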

Let now for any $a < b$,

$$h_{ab}(P) = \frac{1}{b - a} \int_a^b f(P, t)\, dt.$$

By the definition of $x_n$, if a and b are integers, we have

$$h_{ab}(P) = \frac{1}{b - a}\,(x_a + x_{a+1} + \cdots + x_{b-1}).$$

Lemma 2. If $h_{0n}(P)$, as $n \to \infty$ assuming integer values, has
no limit on a set M of positive measure, then there exist two
real numbers $\alpha$ and $\beta$ $(\alpha < \beta)$ and a part M* of the set M,
such that $\mathfrak{M}M^* > 0$ and at each point $P \in M^*$,

$$l(P) = \liminf_{n \to \infty} h_{0n}(P) < \alpha,$$

$$L(P) = \limsup_{n \to \infty} h_{0n}(P) > \beta.$$

This proposition, which is almost self-evident, can be easily
proved. Let us consider the set of all intervals $(\alpha_n, \beta_n)$
$(n = 1, 2, \dots)$ with rational end points (the order of numeration
is immaterial). If $P \in M$ then $l(P) < L(P)$,* and therefore among
the intervals $(\alpha_n, \beta_n)$ there will be found the first one, say
$(\alpha_m, \beta_m)$, for which

$$l(P) < \alpha_m < \beta_m < L(P).$$

Denote by $M_m$ the set of all points P of M which are connected
in this sense with the interval $(\alpha_m, \beta_m)$. It is clear that

$$M = \sum_{m=1}^{\infty} M_m$$

and that the sets $M_i$ and $M_k$ for $i \ne k$ have no points in
common. Since $\mathfrak{M}M > 0$, we will have $\mathfrak{M}M_m > 0$ for at least one
value of m. On setting $\alpha = \alpha_m$, $\beta = \beta_m$, $M_m = M^*$ we see
that Lemma 2 is proved.

*Except for the cases where $h_{0n}(P) \to +\infty$ or $h_{0n}(P) \to -\infty$. It is easy
to see however that for a summable function this can occur only on a set of
measure zero. Indeed, if we had, say, $h_{0n}(P) \to +\infty$ on a set of positive
measure, then, by a known theorem of Egoroff we could assert the uniformity
of this process on a certain other set N, $\mathfrak{M}N > 0$. Let $A > 0$ be arbitrarily
large and let for $n > n_0 = n_0(A)$, $h_{0n}(P) > A$ on N. Assuming $n > n_0$ and
integrating over N, in view of (6) we have

$$A\,\mathfrak{M}N < \int_N h_{0n}(P)\, dV
= \frac{1}{n} \int_0^n d\alpha \int_N f(P, \alpha)\, dV
= \frac{1}{n} \int_0^n d\alpha \int_{N_\alpha} f(P)\, dV
\le \int_V |f(P)|\, dV,$$

which leads to a contradiction since A is arbitrarily large.

Assume now that the conditions of Lemma 2 are satisfied.
Let $P \in M^*$ and consider an interval (a, b) where $a < b$ are
integers. We shall call this segment a proper segment of the
point P if the following conditions are satisfied:

(1) $h_{ab}(P) > \beta$;

(2) $h_{ab'}(P) \le \beta$ for all b' such that $a < b' < b$.

We shall show that two proper segments $(a_1, b_1)$ and $(a_2, b_2)$
of the same point P cannot partially overlap each other.
Indeed, if we had for instance $a_1 < a_2 < b_1 < b_2$, then we
would have

$$(b_1 - a_1)\, h_{a_1 b_1} = (a_2 - a_1)\, h_{a_1 a_2} + (b_1 - a_2)\, h_{a_2 b_1},$$

while, by (1) and (2),

$$h_{a_1 b_1} > \beta, \qquad h_{a_1 a_2} \le \beta, \qquad h_{a_2 b_1} \le \beta,$$

which would lead to the contradictory relation

$$\beta(b_1 - a_1) < \beta(a_2 - a_1) + \beta(b_1 - a_2) = \beta(b_1 - a_1).$$

Furthermore, let us agree to call a proper segment of P a 
maximal proper segment of rank s, if its length does not 
exceed s, and if it is not contained in any other proper segment 
of P whose length does not exceed s. It is easy to see that 




every proper segment of length not exceeding s is contained 
in one and only one maximal proper segment of rank s. Indeed, 
among all the proper segments of length not exceeding s and 
containing the given segment, there will be one of maximal 
length. It is clear that this will be a maximal proper segment 
of rank s. Its uniqueness follows from the fact that if there 
existed two such segments, then they would have points in 
common (as containing the given segment). But then either 
one of them would be contained in the other, and therefore 
would not be a maximal segment of rank s, or they would 
partially overlap, which has just been proved to be impossible. 

For every positive integer s let us denote by $M_s$ the set of
points P of the set M* for which the inequality

$$h_{0n}(P) > \beta$$

holds for at least one $n \le s$. It is obvious that every $P \in M^*$
belongs to all $M_s$ when s is sufficiently large, so that

$$M^* = \sum_{s=1}^{\infty} M_s$$

and, since $M_s \subset M_{s+1}$,

$$\mathfrak{M}M^* = \lim_{s \to \infty} \mathfrak{M}M_s.$$

But $\mathfrak{M}M^* > 0$, hence, for s sufficiently large, also $\mathfrak{M}M_s > 0$.
In what follows we shall denote by s some fixed positive integer
satisfying this condition.

Lemma 3. In order that P belong to the set $M_s$ it is
necessary and sufficient that P have a maximal proper
segment (a, b) of rank s, such that $a \le 0 < b$.

Proof. 1) Let $P \in M_s$ and let n be the smallest positive integer
for which $h_{0n}(P) > \beta$, so that $n \le s$. Then clearly, the
segment (0, n) is a proper segment of P. As has been proved
above, this segment is contained in a unique maximal proper
segment of rank s, which satisfies all conditions of Lemma 3.

2) Let P have a maximal proper segment (a, b) of rank s,
such that $a \le 0 < b$. To complete the proof of Lemma 3 it is
sufficient to show that in this case $h_{0b}(P) > \beta$, because
$b \le b - a \le s$.

If a = 0, our statement is obvious because (a, b) is a proper
segment of the point P, whence $h_{ab}(P) > \beta$. If $a < 0$,

$$(b - a)\, h_{ab}(P) = (x_a + \cdots + x_{-1}) + (x_0 + \cdots + x_{b-1}),$$

or

$$(b - a)\, h_{ab}(P) = -a\, h_{a0}(P) + b\, h_{0b}(P),$$

whence

$$h_{0b}(P) = \frac{1}{b}\left[(b - a)\, h_{ab}(P) + a\, h_{a0}(P)\right].$$

But, by definition of a proper segment,

$$h_{ab}(P) > \beta, \qquad h_{a0}(P) \le \beta,$$

and since $b - a > 0$ and $a < 0$,

$$h_{0b}(P) > \frac{1}{b}\left[(b - a)\beta + a\beta\right] = \beta.$$

Consider now any point P of the set $M_s$ and a maximal
proper segment (a, b) of rank s corresponding to P in the
sense of Lemma 3, so that $a \le 0 < b$, $b - a \le s$. Let $b - a = q$,
$-a = p$; then $0 \le p \le q - 1$. In what follows
we denote by $\delta_{pq}$ the segment $(-p, -p + q)$ and by $M_{pq}$ the
set of all points of $M_s$ which correspond to the segment $\delta_{pq}$
in the sense of Lemma 3, so that

$$M_s = \sum_{q=1}^{s} \sum_{p=0}^{q-1} M_{pq}.$$

It is easy to see that in the natural motion of the space $\Gamma$ the
set $M_{0q}$ after p units of time goes over into $M_{pq}$ (because
$h_{0,q}(P) = h_{-p,\,q-p}(P_p)$). Thus

$$\mathfrak{M}M_{pq} = \mathfrak{M}M_{0q} \qquad (0 \le p \le q - 1).$$

It is also clear that the sets $M_{pq}$ with different pairs of indices
cannot have points in common. Finally, in view of formula (6),
the same relationship between the sets $M_{0q}$ and $M_{pq}$ shows that
for any summable function $\varphi(P)$,

$$\int_{M_{pq}} \varphi(P)\, dV = \int_{M_{0q}} \varphi(P, p)\, dV.$$

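From all that has been said above we conclude

$$\int_{M_s} x_0(P)\, dV
= \sum_{q=1}^{s} \sum_{p=0}^{q-1} \int_{M_{pq}} x_0(P)\, dV
= \sum_{q=1}^{s} \sum_{p=0}^{q-1} \int_{M_{0q}} x_0(P_p)\, dV$$

$$= \sum_{q=1}^{s} \sum_{p=0}^{q-1} \int_{M_{0q}} dV \int_0^1 f(P_p, t)\, dt
= \sum_{q=1}^{s} \sum_{p=0}^{q-1} \int_{M_{0q}} dV \int_p^{p+1} f(P, \alpha)\, d\alpha$$

$$= \sum_{q=1}^{s} \int_{M_{0q}} dV \int_0^q f(P, \alpha)\, d\alpha
= \sum_{q=1}^{s} \int_{M_{0q}} q\, h_{0q}(P)\, dV
> \beta \sum_{q=1}^{s} q\, \mathfrak{M}M_{0q}$$

$$= \beta \sum_{q=1}^{s} \sum_{p=0}^{q-1} \mathfrak{M}M_{pq} = \beta\, \mathfrak{M}M_s.$$

This relation holds for all s sufficiently large; on allowing
$s \to \infty$ we get

$$\int_{M^*} x_0(P)\, dV \ge \beta\, \mathfrak{M}M^*. \tag{10}$$

Since for all points of M* we also have

$$\liminf_{n \to \infty} h_{0n}(P) < \alpha,$$

we can prove in the same way that

$$\int_{M^*} x_0(P)\, dV \le \alpha\, \mathfrak{M}M^*. \tag{11}$$

Since $\alpha < \beta$, the inequalities (10) and (11) are contradictory,
which shows that our assumption $\mathfrak{M}M^* > 0$ is not possible,
or in other words, that the limit

$$\lim_{n \to \infty} h_{0n}(P)$$

must exist almost everywhere.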

To accomplish the proof of Birkhoff's theorem it remains to
remove the restriction that the parameter n assumes only
integral values. This is easily done by means of Lemma 1.
Indeed, since the expression

$$\frac{1}{b} \int_0^{[b]} f(P, t)\, dt$$

(where [b] is the largest integer contained in b), as $b \to \infty$
differs from $(1/[b]) \int_0^{[b]} f(P, t)\, dt$ only by a factor which tends
to 1, and since the latter expression has a limit almost
everywhere, the limit

$$\lim_{b \to \infty} \frac{1}{b} \int_0^{[b]} f(P, t)\, dt$$

also exists almost everywhere. On the other hand

$$\left| \frac{1}{b} \int_{[b]}^{b} f(P, t)\, dt \right|
\le \frac{1}{[b]} \int_{[b]}^{[b]+1} |f(P, t)|\, dt
= \frac{y_{[b]}(P)}{[b]} \to 0 \qquad (b \to \infty)$$

by Lemma 1. Hence the limit

$$\lim_{b \to \infty} \frac{1}{b} \int_0^{b} f(P, t)\, dt$$

also exists almost everywhere, which completes the proof of
the theorem of Birkhoff.





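6. Case of metric indecomposability. We shall call the
quantity

$$\hat{f}(P) = \lim_{C \to \infty} \frac{1}{C} \int_0^C f(P, t)\, dt$$

the "time average" of the function f(P) along the trajectory
passing through P. Such a terminology, strictly speaking,
becomes suitable only after we know that this quantity does
not depend on the choice of the initial point on the given
trajectory, in other words, that for all P and t

$$\hat{f}(P_t) = \hat{f}(P)$$

(assuming of course that $\hat{f}(P)$ exists). We shall prove this
property.

Let, for definiteness, $t > 0$. Since, by assumption, the limit

$$\lim_{C \to \infty} \frac{1}{C} \int_0^C f(P, \alpha)\, d\alpha = \hat{f}(P)$$

exists, and since the difference

$$\frac{1}{C} \int_0^{C+t} f(P, \alpha)\, d\alpha - \frac{1}{C+t} \int_0^{C+t} f(P, \alpha)\, d\alpha
= \frac{t}{C} \cdot \frac{1}{C+t} \int_0^{C+t} f(P, \alpha)\, d\alpha \to 0 \qquad (C \to \infty),$$

we also have

$$\lim_{C \to \infty} \frac{1}{C} \int_0^{C+t} f(P, \alpha)\, d\alpha = \hat{f}(P). \tag{12}$$

But

$$\frac{1}{C} \int_0^C f(P_t, \alpha)\, d\alpha = \frac{1}{C} \int_0^C f(P, t + \alpha)\, d\alpha
= \frac{1}{C} \int_t^{C+t} f(P, \alpha)\, d\alpha
= \frac{1}{C} \int_0^{C+t} f(P, \alpha)\, d\alpha - \frac{1}{C} \int_0^t f(P, \alpha)\, d\alpha.$$

In the right-hand member the first term tends to $\hat{f}(P)$, by (12),
and the second term tends to 0 as $C \to \infty$. Hence

$$\lim_{C \to \infty} \frac{1}{C} \int_0^C f(P_t, \alpha)\, d\alpha = \hat{f}(P).$$

By definition this limit is $\hat{f}(P_t)$, which proves our assertion.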

We turn now to the discussion of the most important special
case of Birkhoff's theorem. Let V be some invariant part (of
finite volume) of the space $\Gamma$. We shall call this part metrically
indecomposable if it cannot be represented in the form

$$V = V_1 + V_2,$$

where $V_1$ and $V_2$ are invariant parts of positive measure. In
order to understand clearly the content of this notion, observe
that the set V, as any invariant set, is a certain set of complete
trajectories. If by any method we separate this set of
trajectories into two other sets (each consisting again of complete
trajectories), then, if V is metrically indecomposable, only one
of the following two cases is possible: either one of the
component parts has measure zero (hence the other has measure
$\mathfrak{M}V$), or both components are not measurable. In the case where
the set V is metrically indecomposable, Birkhoff's theorem can be
made considerably more precise.

Theorem: If the set V is metrically indecomposable, then
almost everywhere on V

$$\hat{f}(P) = \frac{1}{\mathfrak{M}V} \int_V f(P)\, dV.$$

The quantity in the right-hand member of this equality
could be interpreted as an average of the function f on the
set V. We shall call it the phase average of the function f (on
the set V), and denote it by $\bar{f}$. Thus the above stated theorem
asserts that, in the case of the metric indecomposability of the
set V, the time average $\hat{f}(P)$ of any summable function f, for
almost all initial points P, is the same, and coincides with the
phase average $\bar{f}$ of the same function.
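The role of the hypothesis can be seen on a set that is invariant but not metrically indecomposable. The sketch below is an illustration under assumptions made here, not an example from the text: for H = (p^2 + q^2)/2 the annulus between two circles of constant energy is an invariant part that splits into thinner invariant annuli, and the time average of f = q^2 then depends on the trajectory instead of coinciding with the phase average over the annulus. The function names are hypothetical.

```python
import math

def time_average_q2(r, C=2000.0, n=200000):
    # Trajectory of H = (p^2 + q^2)/2 starting at (q, p) = (r, 0): q(t) = r cos t.
    # Midpoint-rule approximation of (1/C) * integral from 0 to C of q(t)^2 dt.
    dt = C / n
    total = sum((r * math.cos((k + 0.5) * dt)) ** 2 for k in range(n))
    return total * dt / C

def phase_average_q2(r1, r2, n=2000):
    # Average of q^2 over the annulus r1 <= sqrt(q^2 + p^2) <= r2 with respect
    # to the area element r dr dtheta; the theta-average of cos^2 is exactly 1/2.
    dr = (r2 - r1) / n
    num = 0.0
    den = 0.0
    for k in range(n):
        r = r1 + (k + 0.5) * dr
        num += 0.5 * r * r * r * dr
        den += r * dr
    return num / den

if __name__ == "__main__":
    print(time_average_q2(1.0), time_average_q2(2.0), phase_average_q2(1.0, 2.0))
```

On the circle of radius r the time average is r^2/2, so it ranges from 0.5 to 2 across the annulus, while the phase average is 1.25; the conclusion of the theorem fails because the annulus is decomposable.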

In order to prove our theorem, let us prove first that
$\hat{f}(P)$ is constant almost everywhere on V. Otherwise, there
would exist such a real number a that in splitting V into two
parts $V_1$ and $V_2$ which are defined respectively by the
conditions $\hat{f}(P) > a$ on $V_1$ and $\hat{f}(P) \le a$ on $V_2$, we would have
$\mathfrak{M}V_1 > 0$ and $\mathfrak{M}V_2 > 0$.* But, by what was proved at the
beginning of this paragraph, the sets $V_1$ and $V_2$ are invariant,
which implies a contradiction to the set V being metrically
indecomposable. Thus $\hat{f}(P)$ almost everywhere on V has the
same constant value which we denote by a. It remains to prove
that $a = \bar{f}$.

Write

$$f_C(P) = \frac{1}{C} \int_0^C f(P, t)\, dt.$$

We have, almost everywhere on V,

$$\lim_{C \to \infty} f_C(P) = \hat{f}(P) = a.$$

By the invariance of the set V (see (7)),

$$\frac{1}{\mathfrak{M}V} \int_V f_C(P)\, dV
= \frac{1}{C\,\mathfrak{M}V} \int_V dV \int_0^C f(P, t)\, dt
= \frac{1}{C\,\mathfrak{M}V} \int_0^C dt \int_V f(P, t)\, dV$$

$$= \frac{1}{C\,\mathfrak{M}V} \int_0^C dt \int_V f(P)\, dV
= \frac{1}{\mathfrak{M}V} \int_V f(P)\, dV = \bar{f}.$$

Hence the quantity

$$\frac{1}{\mathfrak{M}V} \int_V [a - f_C(P)]\, dV = a - \bar{f}$$

does not depend on C. Our theorem will be proved if we show
that this integral is equal to zero.

*In order to prove this, for any positive integer n, let us subdivide the
axis of reals into segments $(k/2^n, (k + 1)/2^n)$ $(-\infty < k < \infty)$, and let us
call such a segment an essential segment, if the set of points $P \in V$ for
which values of $\hat{f}(P)$ belong to this segment has a positive measure. If for
some value of n there exist two essential segments, our assertion is proved.
If, however, for every n there exists only one essential segment $\delta_n$, then
clearly $\delta_{n+1} \subset \delta_n$, so that the sequence of segments $\delta_n$ $(n = 1, 2, \dots)$ has a
single point in common a. It is quite obvious that, in this case, $\hat{f}(P) = a$
almost everywhere on V.

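Let $\varepsilon > 0$ be arbitrarily small. Let $V_1(C)$ be the set of those
points $P \in V$ for which

$$|a - f_C(P)| < \varepsilon,$$

while

$$V_2(C) = V - V_1(C).$$

It is clear that

$$\left| \int_V [a - f_C(P)]\, dV \right|
\le \int_{V_1(C)} |a - f_C(P)|\, dV + \int_{V_2(C)} |a - f_C(P)|\, dV$$

$$\le \varepsilon\, \mathfrak{M}V + |a|\, \mathfrak{M}V_2(C) + \int_{V_2(C)} |f_C(P)|\, dV. \tag{13}$$

Since $f_C(P)$ tends to a almost everywhere on V as $C \to \infty$,
$\mathfrak{M}V_2(C) \to 0$ as $C \to \infty$ (convergence in measure). Thus, for
C sufficiently large, we have $\mathfrak{M}V_2(C) < \varepsilon$. But

$$\int_{V_2(C)} |f_C(P)|\, dV
\le \frac{1}{C} \int_{V_2(C)} dV \int_0^C |f(P, t)|\, dt
= \frac{1}{C} \int_0^C dt \int_{V_2(C)} |f(P, t)|\, dV
= \frac{1}{C} \int_0^C dt \int_{V_2(C,\,t)} |f(P)|\, dV, \tag{14}$$

where $V_2(C, t)$ is the set into which $V_2(C)$ goes over during
the interval of time t, in the natural motion of the phase space.
By the theorem of Liouville, for all t,

$$\mathfrak{M}V_2(C, t) = \mathfrak{M}V_2(C),$$

and $\mathfrak{M}V_2(C, t) \to 0$ as $C \to \infty$ uniformly with respect to t. In
view of the absolute continuity of integrals of summable
functions, we can take C so large that, for all t,

$$\int_{V_2(C,\,t)} |f(P)|\, dV < \varepsilon.$$

Then (14) also shows that, in this case,

$$\int_{V_2(C)} |f_C(P)|\, dV < \varepsilon,$$

and then (13) gives

$$\left| \int_V [a - f_C(P)]\, dV \right| < \varepsilon\, \mathfrak{M}V + |a|\, \varepsilon + \varepsilon.$$

Thus the left-hand member of this inequality will be as small
as we like if C is sufficiently large, and since it does not depend
on C, it must be equal to zero, which had to be proved.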


7. Structure functions. From the point of view of physics, the
most important phase function of the given mechanical system
is its total energy

$$E = E(q_1, \dots, q_s;\ p_1, \dots, p_s).$$

For an isolated system this function has a constant value, in
other words, represents an integral of the system of equations
of motion. Therefore, for any constant a the region of the
phase space at the points of which E = a is an invariant part of
the phase space. For simplicity, we shall call such regions
surfaces of constant energy. We shall consider only such cases
where the function E has a lower bound over the whole space $\Gamma$
(this is the case for the most interesting physical systems).
Using the arbitrariness of choice of the additive constant in the
expression of the potential energy (which enters as a term in
the expression of the function E), we may assume that a
lower bound of E is zero, so that $E \ge 0$ on the whole space $\Gamma$.
Furthermore we always shall assume that the portion of the
phase space characterized by the inequality $E < x$ for each
$x > 0$ is a simply connected domain bounded by the surface
E = x. This surface we shall assume to be closed and sufficiently
smooth to justify analytical methods which will be applied in
various problems. We shall denote by $\Sigma_x$ the surface of constant
energy E = x. For $x_1 < x_2$ the surface $\Sigma_{x_1}$ is situated entirely
inside $\Sigma_{x_2}$, so that, schematically, the family of surfaces of
constant energy could be represented as a family of concentric
hyperspheres. In the natural motion of the phase space each
surface of constant energy, and also each domain bounded by
two such surfaces, is transformed into itself; in other words, it is
an invariant part of the phase space.

All assumptions made above are satisfied by the systems
which are usually considered in the physical applications of
statistical mechanics. Moreover, the total energy of such
systems coincides with the Hamiltonian function. It follows first
that if E is given as a function of the dynamic variables, the
mechanical nature of the given system is completely
determined. Secondly, we can then use the argument which we used
in §3 to prove that the Hamiltonian function is an integral of
the system of equations of motion, as a proof of the law of
conservation of energy for the systems we are going to consider.

Let us denote by V(x) the volume of the part $V_x$ of the
space $\Gamma$ in which $E < x$ (that is, of the domain inside the
surface $\Sigma_x$). V(x) is a monotone function which increases from
0 to $\infty$ as x varies between the same limits. If $x_1 < x_2$ the
volume of the layer enclosed between the surfaces $\Sigma_{x_1}$ and
$\Sigma_{x_2}$ is equal to $V(x_2) - V(x_1)$.

In what follows we shall use the following theorem. 

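Theorem: Let f(P) be a point function in the space $\Gamma$,
summable over a certain domain contained inside the domain $V_x$.
Then

$$\frac{d}{dx} \int_{V_x} f(P)\, dV = \int_{\Sigma_x} f(P)\, \frac{d\Sigma}{\operatorname{grad} E}.$$

(Here dV and $d\Sigma$ denote respectively the volume elements of
the space $\Gamma$ and of the surface $\Sigma_x$, and

$$\operatorname{grad} E = \sqrt{\sum_{k=1}^{2s} \left(\frac{\partial E}{\partial x_k}\right)^2}$$

is the gradient of E.)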

For the proof observe that the element of the volume dV of
the domain $V_x$ in the left-hand member can be replaced by
the product $dn\, d\Sigma$, where $d\Sigma$ is the volume element of that
surface of constant energy on which the element dV abuts, and
dn the element of the outward normal to this surface (the
thickness of the layer separating this surface from another
surface immediately adjoining). Such a change signifies merely a
special choice of the subdivision of the domain $V_x$ which we
use to construct the integral, a choice which is characterized
by the fact that initially the domain is subdivided into infinitely
thin layers by a net of surfaces of constant energy. Thus

$$\int_{V_x} f(P)\, dV = \int_{V_x} f(P)\, d\Sigma\, dn.$$

To simplify, let us denote the dynamic coordinates of the point
P by $x_1, x_2, \dots, x_{2s}$, as was done in §4. Since

$$dx_i = dn \cos(n, x_i) \qquad (1 \le i \le 2s),$$

the increment of the energy when we pass from some surface
of constant energy to an infinitely near surface will be given
by the formula

$$dE = \sum_{i=1}^{2s} \frac{\partial E}{\partial x_i}\, dx_i
= dn \sum_{i=1}^{2s} \frac{\partial E}{\partial x_i} \cos(n, x_i).$$

But it is known that

$$\cos(n, x_i) = \frac{\partial E/\partial x_i}{\operatorname{grad} E} \qquad (1 \le i \le 2s),$$

whence

$$dE = dn\, \frac{\sum_{i=1}^{2s} (\partial E/\partial x_i)^2}{\operatorname{grad} E} = dn \operatorname{grad} E,$$

and we get

$$\int_{V_x} f(P)\, dV = \int_{V_x} f(P)\, \frac{d\Sigma\, dE}{\operatorname{grad} E}
= \int_0^x dE \int_{\Sigma_E} f(P)\, \frac{d\Sigma}{\operatorname{grad} E}.$$

The value of the inner integral here is a function of E. On
denoting this function by $\varphi(E)$ we get

$$\frac{d}{dx} \int_{V_x} f(P)\, dV = \frac{d}{dx} \int_0^x \varphi(E)\, dE = \varphi(x)
= \int_{\Sigma_x} f(P)\, \frac{d\Sigma}{\operatorname{grad} E},$$

as was to be proved.
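The theorem can be checked numerically for one degree of freedom. The sketch below is an illustration under assumptions made here: the energy E = (p^2 + OMEGA^2 q^2)/2, the value of `OMEGA`, the phase function f = q^2, and the polar-type parametrization and midpoint rules are choices for this example and do not come from the text.

```python
import math

OMEGA = 2.0  # illustrative frequency (assumption); E = (p^2 + OMEGA^2 q^2) / 2

def volume_integral(f, x, n=400):
    # Integral of f over V_x = {E < x}, using q = r cos(th) / OMEGA, p = r sin(th),
    # for which E = r^2 / 2 and dV = (r / OMEGA) dr dth.
    R = math.sqrt(2.0 * x)
    dr = R / n
    dth = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += f(r * math.cos(th) / OMEGA, r * math.sin(th)) * r / OMEGA
    return total * dr * dth

def surface_integral(f, x, n=4000):
    # Integral of f dSigma / grad E over the curve Sigma_x = {E = x}.
    R = math.sqrt(2.0 * x)
    dth = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        th = (j + 0.5) * dth
        q = R * math.cos(th) / OMEGA
        p = R * math.sin(th)
        d_sigma = math.hypot(-R * math.sin(th) / OMEGA, R * math.cos(th)) * dth
        grad_e = math.hypot(OMEGA * OMEGA * q, p)
        total += f(q, p) * d_sigma / grad_e
    return total

def derivative_of_volume_integral(f, x, h=1e-3):
    # Central difference approximation of d/dx of the volume integral.
    return (volume_integral(f, x + h) - volume_integral(f, x - h)) / (2.0 * h)

def q_squared(q, p):
    return q * q

if __name__ == "__main__":
    print(derivative_of_volume_integral(q_squared, 1.5), surface_integral(q_squared, 1.5))
```

For this example both sides can also be evaluated by hand and equal $2\pi x/\omega^3$, which is what the two printed numbers approximate.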

Since, by the law of conservation of energy, each surface
$\Sigma_x$ of constant energy is an invariant part of
the space $\Gamma$, in the natural motion of this space every
measurable set M situated on this surface during any interval of
time goes over into another measurable set situated on the
same surface. However, if we define the measure of the set M
by

$$\mathfrak{M}M = \int_M d\Sigma, \tag{15}$$

this measure, in general, would not remain invariant. The set
M moving on the surface $\Sigma_x$ in the natural motion of the phase
space would at the same time change its measure. Such a
situation would have been extremely inconvenient for our
theory, since, in discussing motion on the surface $\Sigma_x$, we would
have been deprived of such valuable tools as the theorems of
Liouville and Birkhoff and their corollaries. That is why in
statistical mechanics the definition (15) of measure on the
surface $\Sigma_x$ is always replaced by another definition which is
invariant with respect to the natural motion of the space $\Gamma$.
After such a replacement we can consider each surface of
constant energy as a bounded invariant region, to the natural
motion of which all the results obtained in preceding paragraphs
can be applied. In the construction of our theory which follows
we make precisely this choice.

In order to obtain an invariant definition of measure on the
given surface $\Sigma_x$ of constant energy E = x, consider any
measurable (in the sense of (15)) set M on it. At each point
of this set draw the outward normal to the surface $\Sigma_x$ to its
intersection with the infinitely near surface $\Sigma_{x+\Delta x}$. The part
of the space $\Gamma$ which is filled by these normals is bounded and
will be denoted by D. The volume

$$\int_D dV$$

of this part is clearly invariant with respect to the natural
motion of the phase space and can be represented also in the
form

$$\int f(P)\, dV,$$

where f(P) has value 1 or 0 according as P does, or does not,
belong to D. The ratio of this volume to $\Delta x$ and also the limit
of this ratio as $\Delta x \to 0$ are also invariants of the natural motion
of the space $\Gamma$. But by the theorem which just has been proved
this limit is

$$\int_{\Sigma_x} f(P)\, \frac{d\Sigma}{\operatorname{grad} E} = \int_M \frac{d\Sigma}{\operatorname{grad} E}.$$

Thus if we define

$$\mathfrak{M}M = \int_M \frac{d\Sigma}{\operatorname{grad} E}$$

we will have an invariant definition of measure on the surface
$\Sigma_x$. This definition of measure we shall use in all that follows
(it is obvious that it satisfies all conditions which a definition
of measure has to satisfy).

In particular, the measure (volume) $\Omega(x)$ of the whole surface
$\Sigma_x$ will be

$$\Omega(x) = \int_{\Sigma_x} \frac{d\Sigma}{\operatorname{grad} E}. \tag{16}$$

If we put $f(P) \equiv 1$ in the general theorem proved above, we
obtain

$$\Omega(x) = V'(x). \tag{17}$$

Thus, the measure of the whole surface of constant energy, with
our definition of measure, is simply equal to the derivative with
respect to $x$ of the volume of the domain $V_x$ of phase space
bounded by this surface. This fact considerably simplifies the
geometry of the structure of the phase space, in which we are
interested now.

According to our definition of measure we shall interpret the
expression

$$\bar{f} = \frac{1}{\Omega(x)} \int_{\Sigma_x} f(P)\, \frac{d\Sigma}{\operatorname{grad} E}$$

as the average of any summable function $f(P)$ defined on $\Sigma_x$.
This is the limit, as $\Delta x \to 0$, of the average of $f(P)$ on the layer
enclosed between the surfaces $\Sigma_x$ and $\Sigma_{x+\Delta x}$. By the theorem
proved above this average can be also represented in the form

$$\bar{f} = \frac{1}{\Omega(x)}\, \frac{d}{dx} \int_{V_x} f(P)\, dV.$$

This formula in many cases turns out very convenient for
evaluation of averages of phase functions on surfaces of constant
energy.

The function $\Omega(x)$ defined by (16) is a monotone function
increasing from 0 to $\infty$* as $x$ varies between the same limits.
As we shall see later, this function completely determines the
most important features of the mechanical structure of the
corresponding physical system. In what follows we shall call
this function the structure function of the given system.
Therefore, the structure function of the given system can be
defined either as the measure of the surface of constant energy
(with our special definition of measure) or as the derivative
with respect to $x$ of the function $V(x)$ defined above.

*Footnote of the translator. This appears as an additional assumption.
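
The two descriptions of the structure function (surface measure with weight 1/grad E, and derivative of the enclosed volume) can be compared numerically in a simple case. The following Python sketch is only an illustration under an assumption not made in the text: the energy is taken to be $E = (x_1^2 + \dots + x_n^2)/2$, so that the surface of constant energy is a sphere of radius $\sqrt{2x}$ on which grad E has constant length. All function names are hypothetical.

```python
import math

def ball_volume(n: int, r: float) -> float:
    """Volume of the n-dimensional ball of radius r."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

def sphere_area(n: int, r: float) -> float:
    """Area of the boundary of the n-dimensional ball of radius r."""
    return n * math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** (n - 1)

def V(x: float, n: int) -> float:
    """Volume of the region E < x for the assumed E = (x_1^2 + ... + x_n^2) / 2."""
    return ball_volume(n, math.sqrt(2 * x))

def omega_surface(x: float, n: int) -> float:
    """Integral of dSigma / grad E over the surface E = x.
    For the assumed energy, grad E has constant length r = sqrt(2x) there."""
    r = math.sqrt(2 * x)
    return sphere_area(n, r) / r

def omega_derivative(x: float, n: int, h: float = 1e-5) -> float:
    """Central-difference approximation of V'(x)."""
    return (V(x + h, n) - V(x - h, n)) / (2 * h)

n, x = 6, 1.7  # n = 2s coordinates, i.e. s = 3 degrees of freedom
print(omega_surface(x, n), omega_derivative(x, n))
```

The two printed numbers should agree up to the error of the finite-difference derivative.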

8. Components of mechanical systems. In this paragraph,
as we have done several times before, we denote by $x_1, x_2,
\dots, x_{2s}$ the dynamical coordinates of a point of the space $\Gamma$,
where the order of numeration is irrelevant. Each phase function
and, in particular, the total energy $E$ of the given system,
is a function of these $2s$ variables.

Suppose that the function

$$E = E(x_1, \dots, x_{2s})$$

can be represented as a sum of two terms $E_1$ and $E_2$ where the
first term depends on some (not all) of the dynamical coordinates
and the second term depends on the remaining coordinates.
Since the order of numeration of the dynamical coordinates
is irrelevant we may write $E = E_1 + E_2$ where

$$E_1 = E_1(x_1, x_2, \dots, x_k),$$

$$E_2 = E_2(x_{k+1}, x_{k+2}, \dots, x_{2s}).$$

In such a case we agree to say that the set $(x_1, x_2, \dots, x_{2s})$
of the dynamical coordinates of the given system is decomposed
into two components $(x_1, \dots, x_k)$ and $(x_{k+1}, \dots, x_{2s})$. We
could express it also by saying that the given system consists
of two “components” which appear as bearers of the corresponding
sets of the dynamic coordinates. From the point of
view of the formal theory it does not make any difference
whether we call a component of the given system the set of
coordinates $(x_1, \dots, x_k)$ itself, or if we attribute to this set
a certain “bearer” to which we shall give the name of a component.
We shall use both terminologies without danger of any
confusion. From a more realistic point of view it appears natural 
to try to interpret each component as a separate physical system 
which is contained in the given system. However, such a 
viewpoint will be too narrow, and in some cases will not suit
our purposes. The point is that, although each materially 
isolated part of our system determines in most cases a certain 
component of this system, it is useful to consider occasionally 
such components (that is sets of coordinates) to which there 
does not correspond any materially isolated part of the system. 
The isolated character of such components is of a purely energetic
nature, the precise sense being given by the above definition of
a component. For instance, if the system consists of one material
particle whose velocity components and mass are respectively
$u$, $v$, $w$, $m$, and if its energy $E$ reduces to the
kinetic energy

$$E = \frac{m}{2}\,(u^2 + v^2 + w^2),$$

we could consider the quantity $u$ as a component of our system,
and formally attribute to it a certain “bearer” whose
energy is $mu^2/2$, although in this case there is no question
of any material bearer (we shall see later that such considerations
can prove to be useful).

In any case, if to each component of the given system we 
may attribute a definite energy, (from the definition of a 
component), then each component, being essentially a group 
of dynamic coordinates, has its own phase space, and the state 
of this component (that is the set of values of its coordinates) 
is represented by a point of this phase space. The phase space 
$\Gamma$ of the given system is clearly the direct product of the phase
spaces $\Gamma_1$ and $\Gamma_2$ of its two components, and the volume element
of the space $\Gamma$ can be taken to be equal to the product
$dV_1\, dV_2$ of the volume elements of the spaces $\Gamma_1$ and $\Gamma_2$.

Furthermore, each component has its own structure function. 
The law of composition of structure functions, that is the 
formula which determines the structure function of the given 
system in terms of structure functions of its components, is 
one of the most important basic formulas of our theory. We 
now pass on to the derivation of this formula. 

First we make the following observation: If we have a phase
function of the given system whose value is completely determined
by the value of the energy of the system at the
corresponding point of the space $\Gamma$, then the integral of such
a function $f(E)$, taken over the domain of the space $\Gamma$ enclosed
between two surfaces of constant energy $\Sigma_{x_1}$ and $\Sigma_{x_2}$,
can be easily expressed in the form of a simple integral, namely,

$$\int_{x_1 < E < x_2} f(E)\, dV = \int_{x_1}^{x_2} f(x)\, \Omega(x)\, dx. \tag{18}$$

Indeed, we may evaluate our multiple integral by subdividing
the domain $x_1 < E < x_2$ of the space into infinitely thin
layers between surfaces of constant energy. In the layer between
the two surfaces $\Sigma_x$ and $\Sigma_{x+\Delta x}$ the function $f(E)$ (which
for simplicity is supposed to be continuous) can (with an
infinitesimal error) be assumed to be equal to $f(x)$, while the
volume of this layer, up to infinitesimals of higher order, is

$$V'(x)\, \Delta x = \Omega(x)\, \Delta x,$$

which gives formula (18). In particular,

$$\int_{\Gamma} f(E)\, dV = \int_0^{\infty} f(x)\, \Omega(x)\, dx. \tag{19}$$

This formula is used in a great number of applications. 
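
The reduction of a phase-space integral of a function of the energy to a simple integral against the structure function can be checked numerically. The Python sketch below is an illustration under assumptions that are not part of the text: a single oscillator with $E = (q^2 + p^2)/2$, for which $V(x) = 2\pi x$ and hence the structure function is the constant $2\pi$, and the test function $f(E) = e^{-E}$. The function names and the truncation limits are hypothetical choices.

```python
import math

def phase_integral(f, half_width=10.0, steps=400):
    """Midpoint-rule integral of f(E) over the (q, p) plane,
    truncated to a square, with the assumed E = (q^2 + p^2) / 2."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        q = -half_width + (i + 0.5) * h
        for j in range(steps):
            p = -half_width + (j + 0.5) * h
            total += f((q * q + p * p) / 2)
    return total * h * h

def energy_integral(f, omega, x_max=50.0, steps=20000):
    """Midpoint-rule integral of f(x) * omega(x) over 0 < x < x_max."""
    h = x_max / steps
    return sum(f((k + 0.5) * h) * omega((k + 0.5) * h) for k in range(steps)) * h

f = lambda e: math.exp(-e)
omega = lambda x: 2 * math.pi  # structure function of the assumed oscillator

lhs = phase_integral(f)          # integral of f(E) dV over the phase plane
rhs = energy_integral(f, omega)  # integral of f(x) Omega(x) dx
print(lhs, rhs)
```

Both numbers should be close to each other, up to the truncation and quadrature errors of the two sums.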

Now let $V(x)$ and $\Omega(x)$ be the functions as defined above for
the given system, and $V_1(x)$, $\Omega_1(x)$ and $V_2(x)$, $\Omega_2(x)$ the
corresponding functions for the two components of the given
system. Then

$$V(x) = \int_{V_x} dV = \int_{(V_x)_1} dV_1 \int_{(V_{x - E_1})_2} dV_2 = \int_{(V_x)_1} V_2(x - E_1)\, dV_1,$$

where $(V_x)_1$ denotes the set of all points of the space $\Gamma_1$ at
which $E_1 < x$, and $(V_{x - E_1})_2$ is defined in an analogous fashion.

Since the phase function $V_2(x - E_1)$ of the first component
depends only on its energy $E_1$, by formula (18) we have

$$V(x) = \int_0^x V_2(x - E_1)\, \Omega_1(E_1)\, dE_1,$$

and since $V_2(x - E_1) = 0$ for $E_1 > x$, the integration can
be extended to infinity so that

$$V(x) = \int_0^{\infty} V_2(x - y)\, \Omega_1(y)\, dy.$$

Finally, on differentiating this with respect to $x$, we have

$$\Omega(x) = \int_0^{\infty} \Omega_1(y)\, \Omega_2(x - y)\, dy. \tag{20}$$


This is the law of composition of structure functions which 
we intended to establish. 
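
The law of composition is a convolution of the structure functions of the two components, and it can be tested on a case where the answer is known in closed form. The Python sketch below assumes, purely for illustration, two identical oscillator components with energies $(q^2 + p^2)/2$: each has the constant structure function $2\pi$, and the combined system has $V(x) = 2\pi^2 x^2$, hence structure function $4\pi^2 x$. The names `compose`, `omega_oscillator` and `omega_pair_exact` are hypothetical.

```python
import math

def compose(omega_1, omega_2, x, steps=2000):
    """Midpoint-rule value of the integral of omega_1(y) * omega_2(x - y)
    over 0 < y < x (both functions are taken to vanish for negative arguments)."""
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(omega_1((k + 0.5) * h) * omega_2(x - (k + 0.5) * h)
               for k in range(steps)) * h

def omega_oscillator(x):
    """Structure function of one assumed oscillator, E = (q^2 + p^2) / 2."""
    return 2 * math.pi if x > 0 else 0.0

def omega_pair_exact(x):
    """V'(x) for two such oscillators, where V(x) = 2 * pi^2 * x^2."""
    return 4 * math.pi ** 2 * x if x > 0 else 0.0

x = 1.3
print(compose(omega_oscillator, omega_oscillator, x), omega_pair_exact(x))
```

Restricting the integral to $0 < y < x$ is equivalent to integrating to infinity, since the second factor vanishes for $y > x$.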

All that has been said above, without any modifications, can
be extended to the case where the given system consists not
of two, but of any number of components. The definition of
component remains unchanged. As before, the space $\Gamma$ is the
direct product of the phase spaces of all the components. For
the law of composition of structure functions, in case of $n$
components, we have the formula

$$\Omega(x) = \int \Big\{ \prod_{i=1}^{n-1} \Omega_i(x_i)\, dx_i \Big\}\, \Omega_n\Big( x - \sum_{i=1}^{n-1} x_i \Big), \tag{21}$$

where the integration is extended over the whole space of
$(n - 1)$ dimensions (or over the domain $x_i > 0$, $1 \le i \le n - 1$,
which is the same, since $\Omega_i(x_i) = 0$ for $x_i < 0$). To derive this
formula it is simplest to use the method of complete induction
from $n$ to $(n + 1)$, by decomposing the $n$-th component in
(21) into two components, and by expressing the last factor in
terms of structure functions of these two components, using
formula (20).
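
The inductive argument suggests a direct way to evaluate the structure function of several components: apply the two-component law repeatedly. The Python sketch below does this for three components and compares the result with a closed form. It rests on an assumption made only for illustration: every component is an oscillator with energy $(q^2 + p^2)/2$, so that $n$ of them have $V(x) = (2\pi x)^n / n!$ and structure function $(2\pi)^n x^{n-1} / (n-1)!$. All names are hypothetical.

```python
import math
from functools import reduce

def compose(omega_a, omega_b, steps=200):
    """Return the function x -> integral of omega_a(y) * omega_b(x - y) dy
    over 0 < y < x, evaluated by the midpoint rule."""
    def omega(x):
        if x <= 0:
            return 0.0
        h = x / steps
        return sum(omega_a((k + 0.5) * h) * omega_b(x - (k + 0.5) * h)
                   for k in range(steps)) * h
    return omega

def omega_oscillator(x):
    """Structure function of one assumed oscillator component."""
    return 2 * math.pi if x > 0 else 0.0

def omega_exact(x, n):
    """Closed form for n assumed oscillators: (2*pi)^n x^(n-1) / (n-1)!."""
    if x <= 0:
        return 0.0
    return (2 * math.pi) ** n * x ** (n - 1) / math.factorial(n - 1)

n = 3
# Compose the components one at a time, mirroring the induction step.
omega_n = reduce(compose, [omega_oscillator] * n)
print(omega_n(0.8), omega_exact(0.8, n))
```

Nesting the quadratures costs `steps ** (n - 1)` function evaluations, so this direct approach is practical only for a small number of components.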

To conclude these brief preliminary considerations, we remark
that the conception of decomposition of the system into
components leads to a specific methodological paradox, as has
been observed several times. As stated already at the beginning
of Chapter I, with all the generality and abstractness of the 
hypotheses of the statistical mechanics, it is invariably assumed 
that particles of the matter are in a state of intensive energy
interaction, where the energy of one particle is transferred to 
another (for instance by means of collisions). As we shall see 
in more detail later, the statistical mechanics bases its method 
precisely on a possibility of such an exchange of energy be¬ 
tween various particles constituting the matter. However, if 
we take the particles constituting the given physical system 
to be the components in the above defined sense, we are ex¬ 
cluding the possibility of any energetical interaction between 
them. Indeed, if the Hamiltonian function, which expresses the 
energy of our system, is a sum of functions each depending only 
on the dynamic coordinates of a single particle (and repre¬ 
senting the Hamiltonian function of this particle), then, clearly, 
the whole system of equations (1) splits into component sys¬ 
tems each of which describes the motion of some separate 
particle and is not connected in any way with other particles. 
Hence the energy of each particle, which is expressed by its 
Hamiltonian function, appears as an integral of equations of 
motion, and therefore remains constant. 

The serious difficulty so created is resolved by the fact that
we can consider particles of matter as only approximately
energetically isolated components. There is no doubt that a
precise expression for the energy of the system must contain 
also terms which depend simultaneously on the energy of 
several particles (mutual potentials of particles), and which 
assure the possibility of an energetical interaction between the 
particles (from a mathematical point of view, prevent the 
splitting of the system (1) into systems referring to single 
particles). But, inasmuch as forces of interaction between the 
particles manifest themselves only at very small distances, such 
mixed terms in the expression of energy, representing mutual 
potential energy of particles, will be (in the great majority of
points of the phase space) negligible as compared with the 
kinetic energy of particles or with the potential energy of 
external fields. In particular they will contribute very little in 
evaluating various averages, hence in the majority of compu¬ 
tations in statistical mechanics we will be able to neglect such 
terms, and, to a good approximation, assume that the energy
of the system is equal to the sum of the energies of constituent 
particles, these thus appearing as components of our system in 
the above defined sense. However, these mixed terms which
are neglected play, as a matter of principle, a very important
role, since it is precisely their presence that assures the possibility
of an exchange of energy between the particles, on which
is based the whole method of statistical mechanics.


CHAPTER III 


ERGODIC PROBLEM 

9. Interpretation of physical quantities in statistical mechanics.
The values of physical quantities which characterize
the state of the system we are studying are determined uniquely
by this state, which, in turn, is described in our theory by
the set of the dynamic coordinates. Thus a physical quantity,
as a rule, appears as a function of the dynamic coordinates
of the system, or, what amounts to the same thing, as a function
of a point of its phase space, that is, as a phase function. Therefore,
if we wish to compare the deductions of our theory with the
experimental data from measurements of various physical quan¬ 
tities, we will compare the values of various physical quantities 
found experimentally with the values of the corresponding 
phase functions furnished by our theory. However, such a 
statement of the problem leads immediately to a series of 
methodological difficulties which threaten to leave this problem 
without any content. The point is that the phase functions of
the system in general represent quantities which assume widely 
distinct values for different states of the system. In order to 
compare these values with experimental data we should have 
a possibility of determining the state of the system at the time 
of the experimental measurement, that is, to determine the 
values of all the dynamic coordinates for this time. For in¬ 
stance, in the case of a gas, this would mean to determine at 
least the positions and the velocities of all constituent mole¬ 
cules, a problem which obviously is insoluble. If we forsake 
this idea, then what states of the system we should assume in 
order to compute those values of the phase functions which 
will have to be compared with the experimental data is an 
entirely open question. 

The following considerations will allow us to alleviate to a 
certain extent the acuteness of this difficulty. An experiment 
or an observation which gives the measurement of a physical
quantity is performed not instantaneously, but requires a certain
interval of time which, no matter how small it appears to
us, would, as a rule, be very large from the point of view of an
observer who watches the evolution of our physical system.
This system will be subjected during this interval of time to
various perturbations (such as mutual collisions of molecules)
which may change essentially the values of the corresponding
phase function. Thus we will have to compare experimental
data not with separate values of phase functions, but with their
averages taken over very large intervals of time; in other words,
according to what was said in the preceding chapter, with time
averages of phase functions over a trajectory which represents
the evolution of our physical system.

This consideration of course changes the picture quite con¬ 
siderably, but, at the same time, immediately introduces new 
difficulties. The first of these arises from the fact that the time 
averages of a given phase function taken over a given trajectory 
may have widely distinct values for different time intervals. 
This difficulty is alleviated considerably by the theorem of 
Birkhoff, which states that, for almost all trajectories, the time 
averages of the given phase function, which tend to a definite 
limit when the time interval tends to infinity, will assume 
approximately the same value for all time intervals, sufficiently 
large. It is therefore natural to take this limit as the time 
average furnished by our theory. 

There is, however, another difficulty which is much harder 
to overcome, namely, that we cannot determine which tra¬ 
jectory in the phase space is traversed by our system. If this 
system has $s$ degrees of freedom (where $s$, as a rule, is a very
large number), in order to determine this trajectory we would
need to find values of $(2s - 1)$ integrals of the system, which
do not depend on time, while actually we can determine
approximately values of only very few of these integrals. (The
value of the energy of the system almost always is considered
to be given.) The determination of any integral gives us in the
phase space a surface which contains the trajectory in question.
If we know the values of $k$ of such integrals, then we know
that our trajectory belongs to a certain “reduced” manifold of
$(2s - k)$ dimensions, so that for $k = 2s - 1$ the trajectory will
be completely determined; if however, as usually happens, we
know only the integral of energy, then $k = 1$, and the only
thing we can say about the trajectory is that it belongs to a
manifold of $(2s - 1)$ dimensions (surface of constant energy).

There is however a case where this difficulty does not exist,
in view of the theorem of section 6. If the given surface of
constant energy is metrically indecomposable, then the time
averages of any summable function are the same for almost all
trajectories and coincide with the phase average of this function
on the given surface of constant energy. In this case every
physical quantity receives a definite interpretation in our
theory as the phase average of the corresponding phase function,
and the above mentioned difficulties no longer exist.
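
The coincidence of time and phase averages can be seen in the simplest possible case. The Python sketch below assumes a single oscillator with $E = (q^2 + p^2)/2$, which is not an example taken from the text: its curve of constant energy is one closed trajectory, so the indecomposability condition holds trivially, and the case says nothing about the many-particle systems of statistical mechanics. The phase function $f = q^2$, the function names and the numerical parameters are hypothetical choices.

```python
import math

def time_average(f, x, total_time=2000.0, steps=100000):
    """Average of f(q, p) along the trajectory of the assumed
    E = (q^2 + p^2) / 2 with energy x, sampled at midpoints of equal time steps.
    The trajectory q = r cos t, p = -r sin t solves dq/dt = p, dp/dt = -q."""
    r = math.sqrt(2 * x)
    dt = total_time / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += f(r * math.cos(t), -r * math.sin(t))
    return total / steps

def phase_average(f, x, steps=100000):
    """Average of f over the curve E = x with weight dSigma / grad E.
    grad E is constant on the circle, so the weight is uniform in the angle."""
    r = math.sqrt(2 * x)
    d_theta = 2 * math.pi / steps
    return sum(f(r * math.cos((k + 0.5) * d_theta),
                 r * math.sin((k + 0.5) * d_theta))
               for k in range(steps)) / steps

f = lambda q, p: q * q
x = 1.5
print(time_average(f, x), phase_average(f, x))
```

For this choice of `f` the phase average equals the energy `x`, and the time average approaches it as `total_time` grows.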

Actually, in all expositions of the statistical mechanics, this 
phase average is taken as a theoretical interpretation of any 
physical quantity. In doing so either no arguments at all are 
given in favor of such a choice, or a special hypothesis is con¬ 
structed in order to justify this choice, or, finally, various 
reasons are cited in favor of such an interpretation, indicating 
at the same time that these reasons are not logically obligatory 
and that the interpretation was generally accepted in view of 
the successful results to which the theory based on this interpretation
leads. The last method appears to us most preferable
scientifically, and, in the following paragraphs of this
chapter, we shall attempt to discuss in detail the most important
questions pertaining to the subject, from the point of
view of modern ideas.

At present we remark in addition that, in view of what has 
been said above, the task of a mathematical justification of the 
statistical mechanics reduces essentially to two problems. The 
first of these two problems is to investigate as exhaustively as
possible under what conditions and to what degree the time
averages of phase functions, which, as we have seen, appear as
a natural interpretation of experimental measurements, can be
replaced by the phase averages of the same functions. The 
desirability, and even inevitability, of such a replacement is 
clear: The computation of time averages requires the knowledge 
of trajectories, that is, the complete integration of the equa¬ 
tions of motion and determination of all the constants of 
integration, which of course cannot be done for systems con¬ 
sidered in statistical mechanics, with their large numbers of 
degrees of freedom. As said before, we shall discuss the question 
connected with this first problem in the following paragraphs 
of the present chapter. 

The second problem, which will be considered in the next
chapters, is to create a general method for approximate computation
of phase averages on surfaces of constant energy. The
evaluation of phase averages, contrary to the evaluation of
time averages, is a problem completely accessible to
mathematical analysis, although it involves certain difficulties. This
problem is always formulated as a problem of constructing a
general method which would allow us to derive sufficiently
simple asymptotic formulas for phase averages, under the
assumption that the number of degrees of freedom of the given
system increases without limit. Since statistical mechanics deals
with systems with very large numbers of degrees of freedom, we may
expect that such asymptotic expressions will be sufficiently near
to the precise values of phase averages.

10. Fixed and free integrals. The problem of a theoretical
justification of the replacement of time averages by phase
averages is usually called the ergodic problem (sometimes this
terminology is used for other related problems). Almost always,
one considers the averages of phase functions on a given
surface of constant energy. Therefore, in attempting to give a
short account of the history and of the present status of the
ergodic problem, we should first try to understand why in our
theory we choose precisely these phase averages. From a purely
theoretical point of view, such a choice, at first glance, appears
casual and arbitrary. Usually such a concept of phase averages
is justified by the following argument. Since the energy is an
integral of the equations of motion, each trajectory is situated
entirely on some surface of constant energy, $\Sigma_x$. The values
of the function under consideration, at the points of the phase
space $\Gamma$ which are not on this surface $\Sigma_x$, play no role in
evaluating the time averages, and therefore should not be taken
into account in evaluating the phase averages, if we desire that
these phase averages be near the time averages.

Such an argument contains a vulnerable point. Everything 
which is said therein about the energy integral can be re¬ 
peated, word for word, for any other integral of motion, which 
does not depend on time. Since, for a system with s degrees 
of freedom, there are (2s — 1) of such independent integrals, 
we should fix the values of each of them beforehand, or, in 
other words, determine the trajectory of the system in the 
phase space, and evaluate our averages along this trajectory. 
This, however, is never done, and is not feasible to do, because 
the great majority of other integrals of motion are not known, 
so that we cannot determine the trajectory which represents 
the evolution of our system. 

Thus the whole question requires more careful consideration. 
It will be most convenient for us to start by making more 
precise the above argument in favor of a preliminary specification
of a surface of constant energy $\Sigma_x$. In itself this argument
is not entirely convincing, but only serves as a starting point 
of our discussion. 

Suppose that we do not specify the surface $\Sigma_x$, but try to
evaluate the phase averages of our functions over the whole
space $\Gamma$. The first, and comparatively non-essential, difficulty
here is due to the fact that this space has an infinite volume,
so that averages of the simplest functions would become infinite
or undetermined, if we did not introduce a preliminary
weighting of the space with the purpose of diminishing the
contributions of distant portions of the space. However, such
a weighting of various parts of the space $\Gamma$ would necessarily
introduce some element of arbitrariness, which would make the
computation of phase averages based on this weighting somewhat
doubtful.¹

However, this difficulty, as we have observed before, is not 
essential in comparison with the other difficulty which ap¬ 
parently makes the whole method completely useless. Indeed, 
the energy of the given system is a phase function, and undoubtedly
one of the most important ones. Our method should
attribute to it some definite average value $\bar{E}$, as for any other
phase function. But what physical meaning could this average
have? In particular, could we expect that the time average of
the energy of the given system will be equal to $\bar{E}$ (or at least
near to $\bar{E}$) in the majority of the evolutionary processes of
which the system is capable? It is sufficient to formulate this 
assumption to understand its absurdity. In each evolutionary 
process the given isolated system preserves a constant value of 
its energy. This constant value we can select, in general, 
arbitrarily over a rather wide range, and in different cases we 
can select different values which are quite far from any fixed 
number prescribed by the theory. The very attempt to attribute 
to the energy of our system any fixed value, no matter by what 
method this value is computed, contradicts reality. And so, the
preliminary reduction of the phase space to some surface of
constant energy appears really inevitable for any efficient evaluation
of the phase averages.

Let us now investigate why we may disregard the claim that all
other integrals independent of time should be treated in the same
way as we have treated the energy. Such claims, as we have
already observed before, appear well founded, at least at first
sight. However, a more attentive consideration will show that
the situation is different. For better understanding we shall
split our argument into several steps.

¹Here the question is one of introducing weights “universal” for the
given system. A weighting adjusted to a definite value of the energy (or of
some other integral) as is done, for instance, in the so-called canonical
distribution of Gibbs (see Chapter V, section 25) is equivalent to a preliminary
specification of the surface $\Sigma_x$, and therefore is not of interest to us
at present.

1. As we shall see later, the majority of physical phase functions
with which we have to deal in statistical physics have a
specific structure which makes the values of such a function,
defined on every surface $\Sigma_x$, very near each other at all points,
except for a set of very small measure. This implies that for
the majority of the trajectories situated on $\Sigma_x$, the time
averages of such a function will have values very near each
other, and therefore near the phase average of the function
over the surface $\Sigma_x$.

2. Let now I be any integral of the equations of motion of 
the given system which does not depend on time and is dis¬ 
tinct from the energy integral. If, considered as a phase func¬ 
tion it has a structure described under 1, then the possibility 
of replacing the time average by the phase average for this 
integral does exist. If, however, I does not have such a structure, 
then the phase function which it represents, as a rule, will not
have an actual physical interpretation, and therefore the relationship
between its various averages will not present any
interest for us.

3. It is possible however that in some cases the arguments
of 2 cannot satisfy us even if they are quite correct. Such cases
occur when the integral $I$ represents a physical quantity which
plays a role analogous to the role of the energy, that is, a
quantity for which we are able to select a value arbitrarily
within certain limits, by regulating the conditions of our
process, or at least are able to determine its value experimentally.
For instance, let $\bar{I}$ be the phase average of the function $I$
over the surface $\Sigma_x$. In view of what has been said in 1 and 2,
the time averages of the function $I$ for most trajectories will
be near $\bar{I}$. But if, for various reasons, we forced $I$ to assume
a value $I_0$ which is far from $\bar{I}$, or if we have found such a value
by experimental measurement, then of course we have to
attribute to $I$ the value $I_0$, not $\bar{I}$. The fact that in most cases
$I$ is near $\bar{I}$ cannot force us to accept this relation if we know
that in our case (whether due to our interference or not) the
value of $I$ is far from $\bar{I}$. Furthermore, if we know the actual value
of $I$, we will have to take it into account in computing values
of other phase functions. In other words, in such a case the
value of the integral $I$ could and should be specified beforehand,
just as has been done with the integral of energy.²

To abbreviate, let us call the integrals of the type described
above controllable integrals (because we can either select, or
determine experimentally, their values; in other words, we control
their values in the process we consider). Let our system have $k$
such controllable integrals (they will almost always include the
integral of energy). If we fix the value of each of them in our
process, we shall specify in the phase space of our system a
certain reduced manifold of $2s - k$ dimensions, over which we
will have to take the phase averages of the phase functions in
which we are interested. In the great majority of cases dealt
with in statistical physics, the only controllable integral is
the energy integral, so that the reduced manifolds will be simply
the surfaces of constant energy $\Sigma_x$. There are, however, cases
where, simultaneously with the energy integral, some other
integrals become controllable (for instance integrals of the
momentum components). In such cases the phase averages are
actually taken over manifolds of a smaller number of dimensions,
which are obtained by fixing the values of the controllable
integrals.

As concerns the remaining free (that is, not fixed) integrals,
each of them, if it represents an actual physical quantity, will
be almost constant, in the above described sense, on the
reduced manifold, as stated in 2. This gives us a certain reason
to expect that its value will be near its phase average on the
reduced manifold, in the majority of cases met in practice. In
other words, we assume that the location of the image point
of a system on the reduced manifold is a random event such
that a very small probability corresponds to the location of the
point in a set of very small measure (absolute continuity!).
Hence it is almost certain that our integral assumes values near
to its average, in the majority of experimental measurements.
Of course, the question of correctness of all this hypothetical
construction can be ultimately decided only by a comparison of
the deductions of our theory with the experimental data.

²An excellent example of how radically all the results of computations
could be changed by fixing the value of such an integral is given by the
statistical schemes of Bose-Einstein and Fermi-Dirac in quantum physics.

The fact that the distinction between the fixed and free 
integrals is determined not by their mathematical nature, but, 
so to say, by their role in our scientific or practical experience, 
should not at all disturb a mathematician. It is typical in all 
applications of the theory of probability. For instance, under 
normal conditions we consider the number of tickets drawn in 
a lottery as a random variable. However, if we succeed in 
studying the mechanism of the drawing to such extent that we 
shall be able to determine this number beforehand, or, still 
more, if we succeed in drawing the number as we desire, then 
all elements of randomness disappear, although the mechanism 
of drawing is the same in both cases. 

In what follows we shall consider as a reduced manifold of 
the phase space the surface of constant energy, which corre¬ 
sponds to the actual situation for the majority of systems dis¬ 
cussed in the statistical physics. Thus it will be our purpose to 
collect as many arguments as possible in favor of the fact that 
the time averages of physically most important phase functions, 
for the great majority of trajectories situated on the given sur¬ 
face of constant energy have values which are close to each 
other (and therefore, necessarily, near the values of the corre¬ 
sponding phase averages). 

11. Brief historical sketch. As we have indicated already, 
many authors attempted to prove the coincidence of the time 
and the phase averages by introducing various special hypotheses,
more or less plausible. Such hypotheses usually were
called “ergodic hypotheses.” The first of them was stated by
Boltzmann, who also was the first to use the terminology.
Boltzmann conjectured that each surface of constant energy
consists of a single trajectory. In other words, no matter what
is the state of our system at a given time, it will pass (or has 
already passed) through any other state with the same value 
of the total energy. 

Using this conjecture it is possible to establish the coincidence 
between the time and phase averages on each surface of constant
energy. However, the conjecture itself is logically contradictory,
which was soon found out, and which at present is
topologically obvious, since a trajectory cannot have multiple
points and therefore cannot fill out the whole many-dimensional
space.

After this failure attempts were made for a long time to
replace the ergodic hypothesis of Boltzmann by the “quasi-ergodic”
hypothesis, according to which every trajectory,
although not filling completely the surface of constant energy
on which it is situated, constitutes an everywhere dense point
set (that is, intersects every element of the surface). However,
even if we disregard the fact that the logical compatibility of
this hypothesis has not been established, nobody succeeded in
proving on this basis the possibility of replacement of the time
averages by the phase averages. Numerous expositions of such
proofs contain grave mistakes. Those authors (as, for instance,
P. Hertz in his known treatise) who do not wish to base their
proofs on false arguments have to introduce several additional
hypotheses.

All this history of the ergodic problem appears to us instructive,
since it makes the efficacy of introducing various
hypotheses which are not supported by any argument very
doubtful. As is usual in such cases, when we are not able to
submit really convincing arguments in favor of replacing the
time averages by the phase averages, it is preferable, and also
simpler, to accept as the “ergodic hypothesis” the very
possibility of such a replacement, and then to judge the theory
constructed on the basis of this hypothesis by its practical
success or failure. This, of course, does not mean that the
theoretical justification of the accepted hypothesis is to be
forgotten. On the contrary, this question remains one of the
most fundamental in statistical mechanics. We wish only
to say that the reduction of this hypothesis to others is little
justified, and does not appear to us to be very efficient.

After several decades of almost fruitless discussions in connection
with the ergodic problem, it was only in 1931 that
the theorem of Birkhoff revived the problem.* From our point
of view the essence of these new researches consists in emphasizing
the notion of metric indecomposability and
in establishing its close connection with the ergodic problem.
Quite often one hears, particularly from physicists,
that Birkhoff’s results do not give anything for the solution of
the ergodic problem, but only reduce it to another problem,
the justification of the metric indecomposability of the surfaces
of constant energy, and, in this sense, are similar to the
ergodic hypotheses introduced by earlier authors. Although we do
not wish to overestimate the role of Birkhoff’s results for the
foundation of statistical mechanics, we cannot share such
a point of view.** The enormous, interesting, and significant
literature which has developed on the basis of Birkhoff’s researches
during the last decade shows that these researches in
the ergodic problem shed light on new problems which had
remained unknown, and opened a most fertile
field for new researches. In all mathematical justifications of
various special fields there usually occur moments when
the introduction of some appropriate notions, although it does
not solve any concrete problem, coordinates and organizes the
whole problem in such a fundamental way that the work of an
investigator is turned from a chaotic and almost helpless
wandering into a sensible and planned conquest of new scientific
facts. There is every reason to believe that the researches of


*Footnote of the translator. The first form of an ergodic theorem was a
somewhat weaker statement proved by J. v. Neumann shortly before
Birkhoff.

**Footnote of the translator. In fact, a general theorem showing the
existence of ergodic transformations on quite general manifolds (or phase
spaces) was proved by J. Oxtoby and S. Ulam (Ann. of Math. 42, 874
(1941)). Their results imply that in a certain sense almost every continuous
transformation is metrically transitive.

Birkhoff represent such a moment in the development of the
ergodic problem. In the following paragraphs we intend to state
some observations concerning the present status of the ergodic
problem.

12. On metric indecomposability of reduced manifolds. As
we know from section 6 of Chapter II, the metric indecomposability
of the given surface of constant energy ensures that the
time averages of any summable function $f(P)$ along almost all
trajectories situated on this surface coincide with the phase
average $\bar f$ of this function over this surface. But it is easy to
see that if we require this property of almost all trajectories
and all summable functions, then the condition of metric
indecomposability is also necessary. Indeed, if the given surface
of constant energy is metrically decomposable, then it can be
split into two invariant sets $M_1$ and $M_2$, each of which has
positive measure. The summable function $\varphi(P)$ which assumes
the value 0 on $M_1$ and the value 1 on $M_2$ cannot have the
same time averages along almost all trajectories (these time
averages are either 0 or 1, while the phase average has some
intermediate value).

Thus we see that the metric indecomposability of the surfaces
of constant energy is a necessary and sufficient condition
for a positive answer to the ergodic problem stated in a certain
precisely determined sense. This fact alone shows the essential
advantage of the investigations of Birkhoff as compared with
the introduction of the old “ergodic hypotheses”.

We now pass on to the question of what general considerations
may make the metric indecomposability of surfaces of
constant energy more or less plausible. Let $\varphi = \varphi(q_1, \ldots, p_s)$
be one of the “free” integrals of the equations of motion of the
given system, that is, an integral independent of the integral
of energy and not containing time explicitly. If the function $\varphi$
remained constant on each surface of constant energy, then the
value of this integral would be uniquely determined
by the value of the integral of energy, and these two integrals
would not be independent, contrary to our assumption. Hence
the function $\varphi$ cannot remain constant on each surface of
constant energy, and, being continuous, cannot remain constant
at almost all points of such a surface.

Let us take a surface of constant energy on which the function
$\varphi$ is not constant almost everywhere. Then we can find
such a real number $a$ that each of the two parts of this surface,
characterized respectively by the inequalities $\varphi > a$ and
$\varphi \le a$, will be of positive measure.* But since $\varphi$ is an integral
of the equations of motion of our system, each of these two
parts is an invariant set. This shows that our surface of constant
energy cannot be metrically indecomposable.

This elementary argument leaves an impression that the 
metric indecomposability of surfaces of constant energy is a 
hypothesis, which, like the Boltzmann hypothesis, never can 
be realized, and therefore should be rejected. However, this 
would mean a complete solution of the ergodic problem in the 
negative sense. 

Formally, there are no objections against the above argument,
and we actually have to agree that the answer to the
ergodic problem at least in the form in which it was formulated 
above, should be in the negative. We shall see however, that if 
we introduce some sensible and natural modifications in the 
formulation of the problem, we may obtain a positive answer, 
at least in some cases. 

So far it was always considered self-evident, although not stated
explicitly, that two distinct points of the phase space represent
two distinct states of our mechanical system. Actually, however,
in many cases identical states of the mechanical system may
correspond to distinct points of the space $\Gamma$.

Let us explain this. In many cases we are forced to characterize
the same physical state of the system not by one, but
by several sets (sometimes even by infinitely many) of values
of its dynamic coordinates. Thus for a point which moves uniformly
along a circumference, if we determine its position by
the central angle counted from some fixed radius, we must

*For the proof see the footnote on p. 30.


consider as identical the states for which the values of this
angle differ by a multiple of $2\pi$.

On the other hand it is obvious that every physical quantity
which characterizes the state of the given system must be
determined uniquely by this state. The phase function which
interprets this physical quantity in our theory must therefore
assume the same value at any two points of the phase space
corresponding to the same state of the given system. We shall
call normal every phase function which satisfies this condition.

Since, in view of the preceding considerations, all physical
quantities for which there may arise the question of comparison
of their theoretical values with experimental data are interpreted
by normal phase functions, we will lose nothing if in
formulating the ergodic problem we state that not all
but only normal summable functions should satisfy its requirement.
Then the condition of metric indecomposability will
cease to be necessary; it will be replaced by a broader necessary
and sufficient condition which can be easily formulated.

We shall call normal every subdivision of the given surface
of constant energy into two invariant parts of positive measure,
such that all points of the surface which correspond to the same
state of the system (we shall call such points physically equivalent)
belong to the same part of the surface. A surface which
does not admit of a normal subdivision we shall call metrically
indecomposable in the extended sense.


Theorem. In order that the time averages of any normal summable
function, taken along almost all trajectories situated on the given
surface of constant energy, coincide with the phase average
of this function over the given surface, it is necessary and sufficient
that this surface be metrically indecomposable in the extended
sense.


A proof of this theorem can be carried through in complete
analogy with the arguments of section 6 of the preceding
chapter (sufficiency) and of the beginning of the present
section (necessity). We leave it to the reader.

After this modification the ergodic problem is reduced to the
question of whether, generally speaking, the surfaces of constant
energy of the mechanical systems under consideration are
metrically indecomposable in the extended sense. First we
shall show that the argument which was used above in establishing
the impossibility of metric indecomposability in the
original sense does not give anything directly if we consider
metric indecomposability in the extended sense.

Indeed, in this argument we subdivided the given surface
of constant energy into two parts, placing a point in the one
or the other of these parts according to the value assumed at
this point by a certain integral $\varphi$. But if now we are interested
only in normal subdivisions, then together with a given point
all points physically equivalent to it must belong to the same part.
If $\varphi$ is a normal integral, that is, assumes the same value at
all physically equivalent points, our argument remains valid.
But if our integral $\varphi$ is not normal, then, in determining the
sets $M_1$ and $M_2$, we cannot start by arbitrarily subdividing the
set of all values assumed by the integral $\varphi$ into two parts. If
we want the subdivision $(M_1, M_2)$ of the surface $\Sigma_x$ to be
normal, we must see to it that the values assumed by $\varphi$ at any
two physically equivalent points are always placed in the same
part. This requirement (as we shall see in an example) may
turn out to be incompatible with the requirement that $M_1$ and
$M_2$ be invariant sets of positive measure. In such a case our
argument becomes invalid, and the question of the possibility of
metric indecomposability in the extended sense remains open.

Later we shall give the simplest of known examples of such
a situation. Now we note that the above argument shows the
impossibility of metric indecomposability even in the extended
sense, if among the free integrals there exists at least one normal
integral. In particular, the energy integral, being always normal,
necessarily has to be fixed. If the system has no other normal
integrals of motion, then we can raise the question of
metric indecomposability in the extended sense on the surfaces of
constant energy.

Let us turn now to an example of metric indecomposability
in the extended sense. Consider a system with two
degrees of freedom, whose situation is determined by two cyclic
coordinates $\varphi$, $\psi$ with period 1. This means that for any
integers $k$ and $l$, the pair of coordinates $\varphi + k$, $\psi + l$ represents
the same situation of the system as the pair $\varphi$, $\psi$ (motion of a
particle on the surface of a torus). The Hamiltonian function
we take to be

$$H = \tfrac{1}{2}\left(\varphi'^2 + \psi'^2\right),$$

where $\varphi'$, $\psi'$ are the dynamic coordinates canonically conjugate to
$\varphi$, $\psi$. If we denote by a dot the differentiation with respect to
time, we can write the canonical system of the equations of
motion in the form

$$\dot\varphi = \varphi', \quad \dot\psi = \psi', \quad \dot\varphi' = 0, \quad \dot\psi' = 0.$$

Three independent integrals which do not contain time explicitly
are given by the functions

$$\varphi', \quad \psi', \quad \varphi\psi' - \psi\varphi'.$$

The first two of these integrals are normal (since two physically
equivalent points can differ by integer values of the
variables $\varphi$ and $\psi$, but $\varphi'$ and $\psi'$ will have for them the same
values). The third integral is not normal. Indeed, let $I$ be
the value of this integral for some state of the system
$(\varphi, \psi, \varphi', \psi')$. For any integers $k$ and $l$ the point
$(\varphi + k, \psi + l, \varphi', \psi')$ represents the same state of the system, but the value
of the third integral at this point is $I + k\psi' - l\varphi'$, which in
general is different from $I$. Furthermore, if the values $\varphi'$ and
$\psi'$ are incommensurable, the third integral assumes an everywhere
dense set of values for the same state of the system.
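The behavior of the third integral can be checked with exact arithmetic. The following is only a minimal sketch: the function names and the particular numbers are hypothetical choices made for illustration, and rational momenta are used purely so that the comparisons are exact.

```python
from fractions import Fraction

def third_integral(phi, psi, phi_p, psi_p):
    """Value of phi*psi' - psi*phi' at a phase point."""
    return phi * psi_p - psi * phi_p

def flow(state, t):
    """Free motion: coordinates advance linearly, conjugate momenta stay fixed."""
    phi, psi, phi_p, psi_p = state
    return (phi + phi_p * t, psi + psi_p * t, phi_p, psi_p)

def shift(state, k, l):
    """A physically equivalent point: coordinates shifted by the integers k, l."""
    phi, psi, phi_p, psi_p = state
    return (phi + k, psi + l, phi_p, psi_p)

# Hypothetical phase point (phi, psi, phi', psi') chosen for illustration.
state = (Fraction(1, 3), Fraction(1, 5), Fraction(2), Fraction(7, 2))
I0 = third_integral(*state)

# The quantity is an integral of motion: it does not change along the flow.
conserved = third_integral(*flow(state, Fraction(9, 4))) == I0

# It is not normal: an integer shift changes it by k*psi' - l*phi'.
k, l = 3, -2
jump = third_integral(*shift(state, k, l)) - I0
```

The value `jump` equals $k\psi' - l\varphi'$, so the same physical state carries different values of the third integral, while the first two integrals are untouched by `shift`.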

According to what was said above, if we desire to construct
a reduced manifold metrically indecomposable in the extended
sense, we have to fix the values of the integrals $\varphi'$ and $\psi'$.
Let $\varphi' = \alpha$, $\psi' = \beta$, where $\alpha$ and $\beta$ are any two incommensurable
real numbers. The reduced space $(\varphi, \psi)$ will then be a
two-dimensional part of the four-dimensional phase space, and
will also be a part of the surface of constant energy
$E = \tfrac{1}{2}(\alpha^2 + \beta^2)$. Let us investigate whether this plane $(\varphi, \psi)$
admits of normal decompositions.


Since every square

(22) $k \le \varphi \le k + 1, \quad l \le \psi \le l + 1,$

where $k$ and $l$ are any integers, is physically equivalent to any
other such square, and since in a normal subdivision all
physically equivalent points must belong to the same part, for
each normal subdivision of the plane $(\varphi, \psi)$ all such squares
will be subdivided into parts which are mutually congruent.

A little explanation is necessary here: the sets $M_1$ and $M_2$
which normally subdivide the plane $(\varphi, \psi)$ clearly will have
infinite measure, which is not foreseen by the definition of a
normal subdivision. In our case, however, this cannot cause
any difficulty, since, in view of the physical equivalence of any
two squares of the type (22), to take the phase average of any
normal phase function we could restrict ourselves to the
consideration of the fundamental square $0 \le \varphi \le 1$, $0 \le \psi \le 1$.
If we transfer any normal subdivision of this square to all
squares (22), we obtain a subdivision of the plane $(\varphi, \psi)$, which
naturally may be called a normal subdivision of the plane
$(\varphi, \psi)$.

Let $(M_1, M_2)$ be any such normal subdivision. Consider the
side $\varphi = 0$ of the fundamental square. Let the point $(0, b)$
$(0 < b < 1)$ of this side belong to the set $M_1$. We assert that
in this case the point $(0, \rho(b + k\beta/\alpha))$,* where $k$ is any integer,
also belongs to $M_1$. Indeed the set $M_1$, being invariant, contains
together with the point $(0, b)$ the whole trajectory passing
through this point, that is, all points of the type $(\alpha t, b + \beta t)$,
and, in particular (for $t = k/\alpha$), the point $(k, b + k\beta/\alpha)$.
But this point is physically equivalent to the point
$(0, \rho(b + k\beta/\alpha))$ of the fundamental square, and therefore, since our
subdivision is normal, must also belong to $M_1$. The numbers
$\rho(b + k\beta/\alpha)$ $(-\infty < k < \infty)$, if $\alpha$ and $\beta$ are incommensurable,
constitute an everywhere dense set, hence the set $M_1$ is everywhere
dense on the side $\varphi = 0$ of our square. Let $A_1$ be the set
which is common to $M_1$ and this side. We can assert that if $A_1$ is
measurable, it must have positive (linear) measure. Indeed, if
the measure of $A_1$ were equal to zero, then the part of $M_1$ in
our square, which consists of the family of parallel lines of slope
$\beta/\alpha$ passing through the points of $A_1$, would have been of
plane measure zero. In view of the mutual congruence of the
subdivisions induced in all squares, we would have $\mathfrak{M}M_1 = 0$,
contrary to our assumption.

*Here $\rho(x) = x - [x]$ is the “fractional part” of the real number $x$.
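The density of the numbers $\rho(b + k\beta/\alpha)$ can be observed numerically. The sketch below is an illustration only: the function names are hypothetical, $\sqrt{2}$ stands in for an irrational ratio $\beta/\alpha$, and floating-point arithmetic is assumed to be accurate enough for the comparison.

```python
import math

def fractional_orbit(b, ratio, n):
    """Sorted fractional parts rho(b + k*ratio) for k = 0, ..., n-1."""
    return sorted((b + k * ratio) % 1.0 for k in range(n))

def largest_gap(points):
    """Largest gap between neighbouring points on the circle [0, 1)."""
    gaps = [q - p for p, q in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])  # wrap-around gap
    return max(gaps)

ratio = math.sqrt(2.0)  # stands in for an irrational beta/alpha

# With an irrational ratio the largest uncovered gap shrinks as k grows.
gap_small = largest_gap(fractional_orbit(0.3, ratio, 10))
gap_large = largest_gap(fractional_orbit(0.3, ratio, 1000))

# With a rational ratio the points repeat and the gaps never close.
rational_gap = largest_gap(fractional_orbit(0.3, 0.25, 1000))
```

In the rational case only four distinct points occur, so a gap of length 1/4 persists no matter how many values of $k$ are taken.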

Let $\varepsilon > 0$ be arbitrarily small, and $\delta = (b_1, b_2)$ such an interval
on the side $\varphi = 0$ of the fundamental square, in which the
average density of the set $A_1$ is greater than $1 - \varepsilon$, that is

$$\mathfrak{M}(\delta \cdot A_1) > (1 - \varepsilon)\,\mathfrak{M}\delta,$$

where $\delta \cdot A_1$ denotes the part of $A_1$ lying in $\delta$
(such an interval exists in view of a known theorem concerning
the density of measurable sets). After the time $t = k/\alpha$ the
interval $\delta$ passes over into an interval of the same length on
the line $\varphi = k$, which, in its turn, is equivalent to some interval
(or a pair of intervals) $\delta'$ of the side $\varphi = 0$ of our square, and
we must have

$$\mathfrak{M}\delta' = \mathfrak{M}\delta.$$

Since the set $M_1$ is invariant,

$$\mathfrak{M}(\delta \cdot A_1) = \mathfrak{M}(\delta' \cdot A_1),$$

so that

$$\mathfrak{M}(\delta' \cdot A_1) > (1 - \varepsilon)\,\mathfrak{M}\delta'.$$

Varying $k$ we obtain, as it is easy to see, along the side $\varphi = 0$
of our square a dense (because of the irrationality of $\beta/\alpha$)
set of intervals $\delta'$ of equal length within which the mean
density of the set $A_1$ exceeds $1 - \varepsilon$. It follows that

$$\mathfrak{M}A_1 \ge 1 - \varepsilon,$$

or, because of the arbitrary value of $\varepsilon$,

$$\mathfrak{M}A_1 = 1.$$

Writing $A_2$ for the set complementary to $A_1$ (on the side $\varphi = 0$)
we have

$$\mathfrak{M}A_2 = 0.$$

As we have already seen, this would lead to $\mathfrak{M}M_2 = 0$,
which contradicts our assumption. This argument shows that
the plane $(\varphi, \psi)$ is metrically indecomposable in the extended
sense. The question as to whether this metric indecomposability
can be considered as a general property of
the broad class of systems encountered in statistical physics
cannot be answered at the present time. We notice, however,
that many authors succeeded in the construction of
rather general examples of the type given above and gave
arguments in favor of the generality of the above statement.
We will not enter here into the discussion of these problems,
but will turn to the analysis of those cases which are of greatest
importance in statistical mechanics.
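The conclusion of the torus example can be tried out numerically: for a normal (doubly periodic) function, the time average along a trajectory with incommensurable $\alpha$, $\beta$ should approach the average over the fundamental square, while a commensurable pair need not give this value. The following is a rough sketch only; the sample function, the step counts and all names are assumptions made for illustration.

```python
import math

def normal_function(phi, psi):
    """A sample normal phase function: it has period 1 in phi and in psi.
    Its average over the unit square is 1/2."""
    return math.cos(2.0 * math.pi * phi) ** 2 + math.sin(2.0 * math.pi * (phi + psi))

def time_average(f, alpha, beta, b, total_time, steps):
    """Midpoint-rule estimate of (1/C) * integral_0^C f(alpha*t, b + beta*t) dt."""
    dt = total_time / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        acc += f(alpha * t, b + beta * t)
    return acc / steps

PHASE_AVERAGE = 0.5  # exact average of normal_function over the unit square

# Incommensurable velocities: the time average approaches the phase average.
irrational = time_average(normal_function, 1.0, math.sqrt(2.0), 0.25, 200.0, 200_000)

# Commensurable velocities (beta = -alpha): phi + psi stays constant along
# the trajectory, and the time average settles at a different value.
rational = time_average(normal_function, 1.0, -1.0, 0.25, 200.0, 200_000)
```

With these particular choices the commensurable trajectory keeps $\varphi + \psi = 0.25$, so its time average is close to 1.5 rather than 0.5.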

13. The possibility of a formulation without the use of
metric indecomposability. All the results obtained by Birkhoff
and his followers (as well as all considerations of the previous
section) pertain to the most general type of dynamic
systems, and consider different problems connected with them.
The authors of these studies have been working, as a rule, on
the development of the so-called “general dynamics”, an important
and interesting branch of modern mechanics. They
have not been interested in the problem of the foundation of
statistical mechanics, which is our primary interest in the
present book. Their aim was to obtain the results in the most
general form; in particular all these results pertain equally to
systems with only a few degrees of freedom and to
systems with a very large number of degrees of freedom.

From our point of view we must deviate from this tendency.
We would unnecessarily restrict ourselves by neglecting the
special properties of the systems considered in statistical
mechanics (first of all their fundamental property of having a
very large number of degrees of freedom), and demanding
the applicability of the obtained results to any dynamic system.
Furthermore, we do not have any basis for demanding the
possibility of substituting phase averages for the time averages
of all functions; in fact the functions for which such substitution
is desirable have many specific properties which make the
possibility of such a substitution apparent in these cases. In the
present section we shall make several elementary remarks along
these lines.

In the field of statistical mechanics we are, first of all,
helped by the fact that the majority of phase functions describing
the most important physical quantities exhibit a very
peculiar behavior (compare section 10). In fact these functions
are, as a rule, approximately constant on the surfaces of constant
energy, i.e., with the exception of a set of points of
very small measure, they possess on each such surface values
which are very close to a certain number characteristic of the
surface. The reasons for such peculiar behavior will be partially
discussed later in this chapter, and we will return to them in
more detail in the later chapters. We will remark here, however,
that these reasons arise partially from the peculiar properties
of the mechanical systems treated in statistical physics
(breaking up into a large number of components), and partially
from the specific properties of the functions with which we are
dealing (these are, as a rule, the “sum-functions”, i.e., the
sums of functions each depending on the dynamical coordinates
of only one component). It is clear without calculation that,
for such functions, the time averages taken along most trajectories
must be very close to the corresponding phase averages.
If desired, however, an approximate proof of the above
statement can be given along the following lines:

Let us assume that the values of the function $f(P)$ on the
surface $\Sigma_a$ (except in a set of points of a very small measure)
are very close to a certain number $A$. Then, unless $f(P)$ assumes
at this small set of points some particularly large values, the
quantity

$$I = \frac{1}{\Omega(a)} \int_{\Sigma_a} |f(P) - A| \, \frac{d\Sigma}{\operatorname{grad} E}$$

will in general be small. Assume for simplicity $A = 0$, which
evidently does not reduce the generality of our considerations.
Assume also, as we did before, that

$$\frac{1}{C} \int_0^C f(P, t)\, dt = f_C(P), \qquad \lim_{C \to \infty} f_C(P) = \hat f(P);$$

finally let $M_\alpha$ be the set of points on the surface $\Sigma_a$ for which
$|\hat f(P)| > \alpha$, and $M'_\alpha$ the set of points for which $|f_C(P)| > \alpha/2$.
Since under the condition $C \to \infty$ (on the surface $\Sigma_a$) $f_C(P) \to \hat f(P)$,
we can write for sufficiently large $C$:

$$\mathfrak{M}M'_\alpha > \tfrac{1}{2}\,\mathfrak{M}M_\alpha;*$$

consequently:

$$\begin{aligned}
\alpha\,\mathfrak{M}M_\alpha &< 2\alpha\,\mathfrak{M}M'_\alpha
\le 4 \int_{M'_\alpha} |f_C(P)| \, \frac{d\Sigma}{\operatorname{grad} E}
\le \frac{4}{C} \int_0^C dt \int_{\Sigma_a} |f(P, t)| \, \frac{d\Sigma}{\operatorname{grad} E} \\
&= \frac{4}{C} \int_0^C dt \int_{\Sigma_a} |f(P)| \, \frac{d\Sigma}{\operatorname{grad} E}
= 4 \int_{\Sigma_a} |f(P)| \, \frac{d\Sigma}{\operatorname{grad} E}
= 4 I\, \Omega(a),
\end{aligned}$$

from which follows

$$\frac{\mathfrak{M}M_\alpha}{\Omega(a)} < \frac{4I}{\alpha};$$

if for example we choose $\alpha = I^{1/2}$, we obtain

$$\frac{\mathfrak{M}M_\alpha}{\Omega(a)} < 4 I^{1/2},$$

so that the relative measure of the set of points on the surface
$\Sigma_a$ for which $|\hat f(P)|$ exceeds the small quantity $I^{1/2}$ is smaller
than the small quantity $4 I^{1/2}$. It is clear that in order to
reach practical conclusions from the above calculation, we
must estimate in each particular case the order of magnitude
of the small quantity $I$. In many cases such an estimate is
actually possible. However, it is also possible to make some
estimates of quite a general nature. Thus, for example, we will
see in the following chapters that for a physical system formed
by $n$ molecules, the most important phase functions are of the
order of magnitude $n$. The “dispersion” of such a function,
i.e., the quantity

$$Df = \frac{1}{\Omega(a)} \int_{\Sigma_a} \left(f(P) - A\right)^2 \frac{d\Sigma}{\operatorname{grad} E},$$

has also, as a rule, the order of magnitude $n$ (Chapter VIII,
Sec. 36). Since, because of the Schwarz inequality,
$I \le (Df)^{1/2} = O(n^{1/2})$, we find, choosing $\alpha = I^{3/2}$ (order of
magnitude $n^{3/4}$), that the relative measure of the set of points for
which

$$|\hat f(P) - A| > K n^{3/4}$$

or

$$\left| \frac{\hat f(P)}{A} - 1 \right| > K_1 n^{-1/4}$$

is a small quantity of order of magnitude at most
$I/\alpha = O(n^{-1/4})$ ($K$ and $K_1$ being positive constants). Since the
quantity $A$ can be assumed to represent the phase average $\bar f$ of
the function $f(P)$ on the surface $\Sigma_a$, the above considerations
supply certain approximate qualitative estimates pertinent to
the replacement of time averages by phase averages. These
almost trivial considerations lead us to suppose that, at least
in the fundamental problems of statistical mechanics, and
especially for practical purposes, we can avoid the use of the
ergodic theorems of general dynamics.

*To prove this inequality, let $M''_\alpha$ be the set complementary to $M'_\alpha$.
If $P \in M_\alpha$ and $P \in M''_\alpha$, we evidently have $|f_C(P) - \hat f(P)| > \alpha/2$.
Hence, because of convergence in measure under the condition $C \to \infty$, we
have $\mathfrak{M}(M_\alpha \cdot M''_\alpha) \to 0$, or $\mathfrak{M}(M_\alpha \cdot M'_\alpha) \to \mathfrak{M}M_\alpha$. From this it follows that,
for sufficiently large $C$, $\mathfrak{M}(M_\alpha \cdot M'_\alpha) > \tfrac{1}{2}\mathfrak{M}M_\alpha$, or a fortiori
$\mathfrak{M}M'_\alpha > \tfrac{1}{2}\mathfrak{M}M_\alpha$.
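The two phase-space ingredients of this estimate, the Schwarz inequality and the bound on the measure of the exceptional set, can be illustrated by sampling. The sketch below rests on loudly stated assumptions: independent components uniform on [0, 1] stand in for the true distribution on the surface of constant energy, only the phase distribution of a sum-function is sampled (no time averages are computed), and all names are hypothetical.

```python
import math
import random

def sample_sum_function(n, rng):
    """One value of a sum-function: n component terms, each uniform on [0, 1]
    (an assumption standing in for the distribution on the energy surface)."""
    return sum(rng.random() for _ in range(n))

def deviation_statistics(n, samples, seed=0):
    """Empirical I = mean |f - A|, dispersion D = mean (f - A)^2, the threshold
    alpha = I**1.5, and the fraction of samples with |f - A| > alpha."""
    rng = random.Random(seed)
    A = n / 2.0  # phase average of the sum-function under the assumption above
    devs = [abs(sample_sum_function(n, rng) - A) for _ in range(samples)]
    I = sum(devs) / samples
    D = sum(d * d for d in devs) / samples
    alpha = I ** 1.5
    fraction = sum(1 for d in devs if d > alpha) / samples
    return I, D, alpha, fraction

I, D, alpha, fraction = deviation_statistics(n=400, samples=2000)
```

For the sampled values, `I` does not exceed `math.sqrt(D)` and `fraction` does not exceed `I / alpha`. The factor 4 in the text arises only when passing from phase values to time averages, a step this sketch does not attempt.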

We make one more remark. In this as well as in the previous
sections, we have been satisfied with situations in which
the desirable phenomenon takes place at all points of the
surface except at a set of points of very small measure
(sometimes exactly zero). It is clear that, taking this point of
view, we make the definite assumption that, if some collection
of the states of the system is represented on the surface $\Sigma_a$
by a set of points of very small relative measure, then the
states belonging to this collection appear very infrequently in
practice.

The exact mathematical formulation of this assumption in
terms of the theory of probability is as follows: considering the
different states of the system (i.e., the points of the surface
$\Sigma_a$) as random events, we assume that they are subject to some
distribution law, otherwise arbitrary but absolutely continuous (i.e.,
a distribution law for which any collection of very small
measure possesses a very small probability). Such an assumption
is in fact absolutely unavoidable in any comparison of
our theory with reality. As a working hypothesis it is quite
natural and gives us a free hand in selecting the distributions
appropriate in practice.

Let us consider, finally, one more simple argument pertinent
to the same group of ideas. Let us call a summable function
$f(P)$ ergodic if, for almost all trajectories, $\hat f(P) = \bar f$. As remarked
before, most of the phase functions considered in
statistical mechanics are “sum-functions”, i.e., represent
the sum of functions each one depending only on the coordinates
of a single molecule. It is clear that such a function will
be ergodic if each of its components is ergodic, since the
averages $\hat f$ and $\bar f$ are both linear operations. Thus, to
prove the ergodic nature of such functions, it suffices to prove
it for the functions corresponding to single molecules. We will
now give some considerations in favor of the above statement.

Let $f(P)$ be a function of the coordinates of only one molecule.
Without restricting the generality of the argument, assume
$\bar f = 0$. Let $M$ be the upper limit of the function $|f|$ on the
surface $\Sigma_a$, and

$$Df = \overline{f^2(P)}, \qquad R(u) = \frac{1}{Df}\, \overline{f(P, t)\, f(P, t + u)}$$

(it is apparent that the last quantity is independent of $t$).
Thus, $Df$ is the phase dispersion of the function $f(P)$, and
$R(u)$ is the phase coefficient of correlation connecting $f(P, t)$
and $f(P, t + u)$. Because of the fact that the given system
consists of a very large number of molecules it is natural to
expect that knowledge of the state of a single molecule at a
certain moment does not permit us to predict anything (or
almost anything) about the state in which this molecule will
be found after a sufficiently long time. For example, the exact
knowledge of the energy of a given molecule at a given moment
of time cannot give us any indications concerning the value
which this energy will have several hours later (because of the
large number of collisions suffered by the molecule during this
time interval). This statement seems to us so natural that it
would be difficult to think otherwise; in fact, this represents the
basic idea of “molecular chaos”. Expressing this in terms of the
theory of probabilities we can say that the stochastic dependence
between the quantities $f(P, t)$ and $f(P, t + u)$ decreases very
rapidly with increasing $u$, and almost entirely vanishes for
appreciably large values of $u$; in particular it means that $R(u)$
must be small for large $u$ and that $R(u) \to 0$ for $u \to \infty$. The
only weak point of the above argument which must be mentioned
here is that, since $R(u)$ is the phase coefficient of correlation,
we cannot be sure to what extent it can be used to
characterize the stochastic dependence between the quantities
in question. However, the relation $R(u) \to 0$ $(u \to \infty)$ represents
a well defined property of the function $f(P)$ and of the natural
motion on the surface $\Sigma_a$, a property which must necessarily
take place if the initial correlation between the quantities
$f(P, t)$ and $f(P, t + u)$ becomes, “generally speaking”, weaker
with increasing $u$. The expression “generally speaking” attains
exact meaning in terms of the measure on the surface $\Sigma_a$, and,
as we have seen above, the stochastic interpretation of this
measure represents the necessary postulate of the entire theory.




From this point of view it is interesting to prove the
following theorem:

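Theorem: If $R(u) \to 0$ for $u \to \infty$, the function $f(P)$ is ergodic.

Proof: Assuming (as usual)

$$\lim_{C \to \infty} \frac{1}{C} \int_0^C f(P, t)\, dt = \hat f(P),$$

we have

$$\begin{aligned}
\overline{\hat f^2(P)} &= \frac{1}{\Omega(a)} \int_{\Sigma_a} \hat f^2(P)\, \frac{d\Sigma}{\operatorname{grad} E} \\
&= \frac{1}{\Omega(a)} \int_{\Sigma_a} \left\{ \hat f^2(P) - \left[ \frac{1}{C} \int_0^C f(P, t)\, dt \right]^2 \right\} \frac{d\Sigma}{\operatorname{grad} E}
+ \frac{1}{\Omega(a)} \int_{\Sigma_a} \frac{d\Sigma}{\operatorname{grad} E}\, \frac{1}{C^2} \int_0^C \int_0^C f(P, u)\, f(P, v)\, du\, dv.
\end{aligned}$$

Writing $Q$ for the above expression in braces, and $G_1$ and
$G_2$ respectively for the set of points on the surface for which
$|Q| < \varepsilon$ and its complementary set, we obtain

$$\overline{\hat f^2(P)} = \frac{1}{\Omega(a)\, C^2} \int_0^C \int_0^C du\, dv \int_{\Sigma_a} f(P, u)\, f(P, v)\, \frac{d\Sigma}{\operatorname{grad} E}
+ \frac{1}{\Omega(a)} \int_{G_1} Q\, \frac{d\Sigma}{\operatorname{grad} E}
+ \frac{1}{\Omega(a)} \int_{G_2} Q\, \frac{d\Sigma}{\operatorname{grad} E}.$$

Since

$$Q = \hat f^2(P) - \left[ \frac{1}{C} \int_0^C f(P, t)\, dt \right]^2,$$

we conclude that under the condition $C \to \infty$, $Q \to 0$ almost
everywhere on the surface $\Sigma_a$. Consequently $\mathfrak{M}G_2 \to 0$, from
which follows that for sufficiently large $C$:

$$\frac{\mathfrak{M}G_2}{\Omega(a)} = \frac{1}{\Omega(a)} \int_{G_2} \frac{d\Sigma}{\operatorname{grad} E} < \varepsilon.$$

Since obviously

$$\frac{1}{\Omega(a)} \int_{\Sigma_a} f(P, u)\, f(P, v)\, \frac{d\Sigma}{\operatorname{grad} E} = Df\, R(u - v)$$

and

$$|Q| < \varepsilon \quad (P \in G_1), \qquad |Q| \le M^2 \quad (P \in G_2),$$

we conclude that:

$$\overline{\hat f^2(P)} \le \frac{Df}{C^2} \int_0^C \int_0^C R(u - v)\, du\, dv + \varepsilon + M^2 \varepsilon.$$

If $|R(u)| < \varepsilon$ for $|u| > B$, we get (taking into consideration
that $|R(u)| \le 1$ for any $u$)

$$\begin{aligned}
\overline{\hat f^2(P)} &< \frac{Df}{C^2} \int_0^C du \int_{\max(0,\, u - B)}^{\min(C,\, u + B)} R(u - v)\, dv
+ \frac{Df\, \varepsilon}{C^2} \int_0^C \int_0^C du\, dv + \varepsilon + M^2 \varepsilon \\
&\le \frac{2B\, Df}{C} + \varepsilon \left( Df + 1 + M^2 \right).
\end{aligned}$$

Since we can choose $\varepsilon$ arbitrarily small and $C$ arbitrarily large,

$$\overline{\hat f^2(P)} = 0,$$

so that almost everywhere on $\Sigma_a$

$$\hat f(P) = 0 = \bar f.$$

This proves the ergodic nature of the function $f(P)$.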
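The key step of the proof is that the double integral of the correlation coefficient, divided by $C^2$, becomes small for large $C$ once $R(u)$ decays. A minimal numerical sketch follows. The exponential form of $R$ is an assumption chosen for illustration (it is not taken from the text), and the function names, the cutoff `B` and the tolerance `eps` are hypothetical.

```python
import math

def mean_correlation(R, C, steps=400):
    """Midpoint-rule estimate of (1/C^2) * double integral of R(u - v)
    over the square 0 <= u, v <= C."""
    h = C / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        for j in range(steps):
            v = (j + 0.5) * h
            total += R(u - v)
    return total * h * h / (C * C)

def R(u):
    """Assumed correlation coefficient: |R| <= 1 and R(u) -> 0 as |u| grows."""
    return math.exp(-abs(u))

def exact(C):
    """Closed form of the same quantity for the exponential R above."""
    return 2.0 / C - 2.0 * (1.0 - math.exp(-C)) / (C * C)

def bound(C, B, eps):
    """The estimate 2B/C + eps, valid when |R(u)| < eps for |u| > B."""
    return 2.0 * B / C + eps

eps = 0.01
B = math.log(1.0 / eps)  # exp(-|u|) < eps as soon as |u| > B
value_10 = mean_correlation(R, 10.0)
value_100 = mean_correlation(R, 100.0)
```

The computed values decrease with $C$ and stay below `bound(C, B, eps)`, in line with the estimate used in the proof.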




CHAPTER IV 


REDUCTION TO THE PROBLEM OF THE 

THEORY OF PROBABILITY 

14. Fundamental distribution law. We shall now consider
the aggregate of $2s$ dynamic variables $(x_1, x_2, \ldots, x_{2s})$,
determining the state of a given system $G$ with $s$ degrees of
freedom, as a multidimensional random quantity (random
vector). We shall assume as usual that the energy $E$ of
the system has a certain constant value $a$, so that all possible
values of the random vector $(x_1, x_2, \ldots, x_{2s})$ correspond
to the points of a certain surface $\Sigma_a$. The probability that the
representative point of the system will fall within a certain
set $M$ on the surface $\Sigma_a$ will be assumed to be given by

$$\frac{1}{\Omega(a)} \int_M \frac{d\Sigma}{\operatorname{grad} E},$$

where the measure $\Omega(a) = \int_{\Sigma_a} d\Sigma / \operatorname{grad} E$ of the surface $\Sigma_a$ is the
structure function of the system. It is obvious that the probability
field introduced in this way satisfies all necessary
conditions. The distribution law of the random vector
$(x_1, x_2, \ldots, x_{2s})$ thus established will be called in the future
the fundamental distribution law of the system (for $E = a$).
Let $\varphi(x_1, x_2, \ldots, x_{2s})$ be an arbitrary measurable phase
function of the system $G$. Then the probability of the inequality

$$\varphi(x_1, x_2, \ldots, x_{2s}) < x,$$

where $x$ is an arbitrary real number, will be determined by the
formula

$$\mathsf{P}(\varphi < x) = \frac{1}{\Omega(a)} \int_{\varphi < x} \frac{d\Sigma}{\operatorname{grad} E}.$$

Thus, any measurable phase function can be considered as a
random quantity with a well defined distribution law. The
mathematical expectation of this quantity,

$$\mathsf{E}\varphi = \frac{1}{\Omega(a)} \int_{\Sigma_a} \varphi\, \frac{d\Sigma}{\operatorname{grad} E},$$

coincides (under the assumption of absolute convergence of this
integral) with the quantity which we called the phase average
$\bar\varphi$ of the function $\varphi$ in section 7 of Chapter II. In particular,
if $\varphi$ is an ergodic function, its time average for almost any
trajectory on the surface $\Sigma_a$ coincides with its mathematical
expectation $\mathsf{E}\varphi$.

If $\varphi$ is the characteristic function of a certain measurable set
of points $M$ on the surface $\Sigma_a$ (i.e., if $\varphi = 1$ within $M$ and
$\varphi = 0$ outside it), $\mathsf{E}\varphi$ obviously gives the probability of finding
the point $(x_1, x_2, \ldots, x_{2s})$ within the set $M$. If $\varphi$
is an ergodic function this probability coincides with the
relative mean time spent by the moving point within the set
$M$ for almost any trajectory located on the surface $\Sigma_a$.

The fundamental distribution law formulated above permits us to introduce convenient probabilistic terminology for the ideas connected with evaluation of phase averages. At the same time, as we shall see later, this formulation of the fundamental distribution law will permit us to use the well known analytical apparatus of the theory of probability for the solution of many fundamental problems in statistical mechanics.


15. The distribution law of a component and its energy. Let a given system $G$ have a component $G_1$ with dynamic coordinates $(x_1, x_2, \ldots, x_r)$ (the complementary component $G_2$ having dynamic coordinates $x_{r+1}, \ldots, x_{2s}$). The fundamental distribution law assumed for the system $G$, i.e. for the multidimensional random quantity $(x_1, \ldots, x_{2s})$, uniquely determines, according to the well known rules of probability, the distribution law for an arbitrary group of dynamical variables in the system $G$.

In particular, the set of variables $(x_1, x_2, \ldots, x_r)$ $(r < 2s)$ or, as we shall say for brevity, the component $G_1$ is subject to a definite distribution law in the space of $r$ dimensions, which, of course, coincides with its phase space. Let us now find this distribution law.

Let $M_1$ be a measurable set in the phase space $\Gamma_1$ of the component $G_1$, in which $(x_1, x_2, \ldots, x_r)$ serve as Cartesian coordinates. Further, let $M$ be the set of points in the phase space $\Gamma$ of the system $G$ for which the first $r$ coordinates represent a point in the space $\Gamma_1$ belonging to the set $M_1$ (so that the point $(x_1, \ldots, x_r)$ will belong to the set $M_1$ when and only when the point $(x_1, \ldots, x_{2s})$ belongs to the set $M$). The probability that the representative point $P_1$ of the component $G_1$ falls within the set $M_1$ coincides with the probability that the representative point $P$ of the system $G$ falls within the set $M$ (both probabilities being determined, as usual, under the assumption that $E = a$), so that we have (denoting by $AB$ the intersection, i.e. the common part, of the sets $A$ and $B$):

$$\mathbf{P}(P_1 \in M_1) = \mathbf{P}(P \in M\Sigma_a) = \frac{1}{\Omega(a)} \int_{\Sigma_a} \varphi \, \frac{d\Sigma}{\operatorname{grad} E},$$

where $\varphi$ stands for the previously defined characteristic function of the set $M$. Because of the general theorem (section 7, Chapter II) this gives us:

(23) $$\mathbf{P}(P_1 \in M_1) = \frac{1}{\Omega(a)} \frac{d}{da} \int_{V_a} \varphi \, dV,$$

where $dV$ stands for the volume element in the phase space $\Gamma$ of the system $G$, i.e.

$$dV = dx_1 \cdots dx_{2s}.$$

Since the function $\varphi$ is independent of the variables $x_{r+1}, \ldots, x_{2s}$ we conclude that

$$\int_{V_a} \varphi \, dV = \int_{E_1 < a} \varphi \, dV_1 \int_{(V_{a - E_1})_2} dV_2,$$

where $dV_1 = dx_1 \cdots dx_r$, $dV_2 = dx_{r+1} \cdots dx_{2s}$, $E_1 = E_1(x_1, \ldots, x_r)$ and $E_2 = E_2(x_{r+1}, \ldots, x_{2s})$ represent the energies of the components $G_1$ and $G_2$, whereas $(V_{a - E_1})_2$ is the set of points in the space $\Gamma_2$ for which $E_2 < a - E_1$.

The outer integration can be extended through the entire space $\Gamma_1$ without changing the results since for $E_1 > a$ the inner integral vanishes.

In the above expression the inner integral represents the volume of that part of the phase space $\Gamma_2$ of the component $G_2$ where $E_2 < a - E_1$. Denoting it as usual by $V_2(a - E_1)$ we will have:

$$\int_{V_a} \varphi \, dV = \int_{M_1} V_2(a - E_1) \, dV_1,$$

consequently:

$$\frac{d}{da} \int_{V_a} \varphi \, dV = \int_{M_1} \Omega_2(a - E_1) \, dV_1,$$

since it is clear from the definition of the structure function that $V_2'(x)$ coincides with the structure function $\Omega_2(x)$ of the component $G_2$. Thus the relation (23) gives us

(24) $$\mathbf{P}(P_1 \in M_1) = \frac{1}{\Omega(a)} \int_{M_1} \Omega_2(a - E_1) \, dV_1.$$

It must be remembered that in the above expression $dV_1 = dx_1 \cdots dx_r$ and $E_1$ is a function of $x_1, \ldots, x_r$.

We see that in the case where the energy of the system $G$ is equal to $a$, the distribution law of the component $G_1$ in its phase space $\Gamma_1$ is given by the density function

$$\frac{\Omega_2(a - E_1)}{\Omega(a)},$$

where $\Omega_2$ is the structure function of the complementary component $G_2$. This fact permits us to write the expression for the phase average of any function depending on $x_1, \ldots, x_r$ in the form of an integral extended over the space $\Gamma_1$.

Indeed, let $\varphi(x_1, \ldots, x_r)$ be such a function. We know that its phase average $\bar{\varphi}$ coincides with the mathematical expectation $\mathbf{E}\varphi$ which, according to the above results, can be written in the form

(26) $$\bar{\varphi} = \mathbf{E}\varphi = \frac{1}{\Omega(a)} \int_{\Gamma_1} \varphi \, \Omega_2(a - E_1) \, dV_1.$$

The most important function of the above type is the energy $E_1 = E_1(x_1, \ldots, x_r)$ of the component $G_1$. Because of (26) we have:

$$\bar{E}_1 = \mathbf{E}E_1 = \frac{1}{\Omega(a)} \int_{\Gamma_1} E_1 \, \Omega_2(a - E_1) \, dV_1.$$

However, because of the particular importance of the quantity $E_1$, we will not limit ourselves to establishing only its mean value, but we will also find its distribution law.

We have seen that the aggregate $(x_1, \ldots, x_r)$ of the dynamical coordinates of the component $G_1$ is a multidimensional random quantity distributed in the space $\Gamma_1$ with the density

$$\frac{\Omega_2(a - E_1)}{\Omega(a)}.$$

Accordingly, the probability that $g_1 < E_1 < g_2$ is given by:

$$\mathbf{P}(g_1 < E_1 < g_2) = \frac{1}{\Omega(a)} \int_{g_1 < E_1 < g_2} \Omega_2(a - E_1) \, dV_1.$$

According to formula (18) Chapter II, this multiple integral in the space of $r$ dimensions can be written in the form of a simple integral

$$\int_{g_1}^{g_2} \Omega_1(E_1) \, \Omega_2(a - E_1) \, dE_1,$$

which brings us to the relation

$$\mathbf{P}(g_1 < E_1 < g_2) = \frac{1}{\Omega(a)} \int_{g_1}^{g_2} \Omega_1(x) \, \Omega_2(a - x) \, dx.$$


Thus the random quantity $E_1$ is subject to the probability density

$$\frac{\Omega_1(x) \, \Omega_2(a - x)}{\Omega(a)}.$$

This permits us to express the phase average of any function $\varphi(E_1)$ of the energy of the component $G_1$ in the form of an ordinary integral:

(28) $$\bar{\varphi} = \mathbf{E}\varphi(E_1) = \int \varphi(x) \, \frac{\Omega_1(x) \, \Omega_2(a - x)}{\Omega(a)} \, dx.$$

In particular:

(29) $$\bar{E}_1 = \mathbf{E}E_1 = \frac{1}{\Omega(a)} \int x \, \Omega_1(x) \, \Omega_2(a - x) \, dx.$$

In the last two formulae the integrals can be taken between infinite limits; in fact, since the integrated function is different from zero only for $0 < x < a$ there is no divergence difficulty.

In applications we will usually encounter phase functions 
which depend on the dynamical coordinates of some com¬ 
ponents of the given system, and include essentially the energy 
of this component. As we have just seen, the distribution law
for the energy of the given component as well as for its dynamic
variables contains the structure functions $\Omega_1$ and $\Omega_2$.
(It may be noted that the general formulae determining
the mean values of an arbitrary phase function on the surface $\Sigma_a$
also contain the quantity $\Omega(a)$.) Thus it is clear that any
analytical method of deriving the approximate formulae for the 
mean values of the phase function used in statistical mechanics 
must first of all give convenient approximate expressions for 
the structure functions. Accordingly, in our approach to the 
problem, we will try to use the fact that the systems usually 
considered in statistical mechanics consist of a very large 
number of similar components. Using the methods of the theory 
of probability we will be able to establish for the structure 
functions of such systems the approximate expressions which 
are to a large extent independent of the nature of individual 
components. 
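As a numerical illustration of the energy density $\Omega_1(x)\Omega_2(a - x)/\Omega(a)$ derived in this section, the following minimal sketch assumes power-law structure functions $\Omega_i(x) = x^{k_i - 1}/\Gamma(k_i)$. This choice, the exponents, and the helper names are assumptions made for illustration and are not taken from the text; for this family the composition of $\Omega_1$ and $\Omega_2$ is again of the same form with exponent $k_1 + k_2$. The sketch checks that the density integrates to one and computes the mean energy as in (29).

```python
import math

def omega(x, k):
    """Assumed structure function x**(k-1)/Gamma(k) for x > 0, zero otherwise."""
    return x ** (k - 1) / math.gamma(k) if x > 0 else 0.0

def energy_density(x, a, k1, k2):
    """Density Omega_1(x) * Omega_2(a - x) / Omega(a) of the energy of G_1."""
    return omega(x, k1) * omega(a - x, k2) / omega(a, k1 + k2)

def midpoint(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, k1, k2 = 10.0, 2.0, 3.0   # illustrative total energy and exponents
total = midpoint(lambda x: energy_density(x, a, k1, k2), 0.0, a)
mean = midpoint(lambda x: x * energy_density(x, a, k1, k2), 0.0, a)
print(total, mean)
```

For these assumed structure functions the mean energy of $G_1$ is the fraction $k_1/(k_1 + k_2)$ of the total energy $a$.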


16. Generating functions. Let us consider a system $G$ whose structure function $\Omega(x)$ is subject to the usual conditions: it is positive and monotonically increasing for $x > 0$, it is equal to zero for $x < 0$, it is continuous and increases without bound for $x \to \infty$. In addition, we will require the integral

(30) $$\Phi(\alpha) = \int_0^\infty e^{-\alpha x} \, \Omega(x) \, dx$$

to converge for any $\alpha > 0$ (it may be remarked that this condition is satisfied in all actual physical problems).

In the future we will call the function $\Phi(\alpha)$ (which is nothing else but the "Laplace transform" of the structure function $\Omega(x)$) the generating function of the system $G$, because of its fundamental role in our analytical method. For the same reason we will discuss in more detail the fundamental properties of such generating functions.

Each generating function is completely determined for all positive values of its argument by the expression (30); only this case will be considered. From the definition of the generating function it follows that:

(1) $\Phi(\alpha)$ is a positive and monotonically decreasing function of $\alpha$.

(2) $\Phi(\alpha) \to \infty$ for $\alpha \to 0$.

Furthermore, it is easy to prove that:

(3) For any $\alpha > 0$, $\Phi(\alpha)$ has derivatives of all orders. For $n = 1, 2, \ldots$:

(31) $$\Phi^{(n)}(\alpha) = (-1)^n \int_0^\infty x^n e^{-\alpha x} \, \Omega(x) \, dx.$$

In fact, for any positive number $\alpha_0$, and for any large number $n$, we have, for sufficiently large $x$ and $\alpha > \alpha_0$:

$$x^n e^{-\alpha x} < e^{(\alpha_0/2)x - \alpha_0 x} = e^{-(\alpha_0/2)x}.$$

From this it follows that the integral in the expression (31) converges uniformly for $\alpha > \alpha_0$.

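We also notice that, since $\Phi(\alpha)$ is always positive, the function $\log \Phi(\alpha)$ also possesses all the properties mentioned above (except, of course, being positive); in particular any generating function has logarithmic derivatives of all orders.

(4) The second logarithmic derivative of the function $\Phi(\alpha)$ is always positive for $\alpha > 0$.

In fact direct calculation shows that:

$$\frac{d^2 \log \Phi(\alpha)}{d\alpha^2} = \frac{\Phi(\alpha)\Phi''(\alpha) - [\Phi'(\alpha)]^2}{[\Phi(\alpha)]^2} = \frac{1}{\Phi(\alpha)} \int_0^\infty \left( x + \frac{\Phi'(\alpha)}{\Phi(\alpha)} \right)^2 e^{-\alpha x} \, \Omega(x) \, dx.$$

From this follows:

(5) The equation

$$-\frac{\Phi'(\alpha)}{\Phi(\alpha)} = a$$

has one single positive solution for any $a > 0$.

In fact consider the function

$$\Phi_a(\alpha) = e^{a\alpha} \, \Phi(\alpha).$$

Because of the property (2), $\Phi_a(\alpha) \to \infty$ for $\alpha \to 0$, and since

$$\Phi_a(\alpha) > e^{a\alpha} \int_0^{a/2} e^{-\alpha x} \, \Omega(x) \, dx > e^{a\alpha/2} \int_0^{a/2} \Omega(x) \, dx,$$

we can conclude that $\Phi_a(\alpha) \to +\infty$ also for $\alpha \to \infty$. It is also apparent that the function $\log \Phi_a(\alpha)$ possesses the same properties. However, $\log \Phi_a(\alpha)$ is a convex function since its second derivative, coinciding with the second derivative of $\log \Phi(\alpha)$, is always positive because of the property (4); this shows that the function $\log \Phi_a(\alpha)$, becoming infinite for $\alpha \to 0$ and $\alpha \to \infty$, must necessarily possess a single minimum. At the point of minimum

$$\frac{d \log \Phi_a(\alpha)}{d\alpha} = a + \frac{\Phi'(\alpha)}{\Phi(\alpha)} = 0,$$

which proves our statement.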
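Property (5) can be illustrated numerically. The sketch below assumes the generating function $\Phi(\alpha) = \alpha^{-k}$ (an illustrative choice, not taken from the text; it corresponds to $\Omega(x) = x^{k-1}/\Gamma(k)$) and finds the root of $-\Phi'(\alpha)/\Phi(\alpha) = a$ by bisection. The bisection relies on property (4): since the second logarithmic derivative is positive, $-d\log\Phi/d\alpha$ is a decreasing function of $\alpha$, so the root is unique. The function names and the bracketing interval are hypothetical.

```python
import math

def log_phi(alpha, k=1.5):
    """log of the assumed generating function Phi(alpha) = alpha**(-k)."""
    return -k * math.log(alpha)

def neg_log_derivative(alpha, h=1e-6):
    """-d log Phi / d alpha, approximated by a central difference."""
    return -(log_phi(alpha + h) - log_phi(alpha - h)) / (2 * h)

def solve_alpha(a, lo=1e-3, hi=1e3, iterations=200):
    """Bisection for -d log Phi/d alpha = a, using that the left side decreases in alpha."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if neg_log_derivative(mid) > a:
            lo = mid   # left side still too large, so the root lies at larger alpha
        else:
            hi = mid
    return 0.5 * (lo + hi)

root = solve_alpha(a=3.0)
print(root)
```

For this assumed $\Phi$ the equation reads $k/\alpha = a$, so the computed root can be compared with $k/a$.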

The most important property of generating functions is their law of composition, i.e. the law by which the generating function of a system is constructed from the generating functions of its components. Let the system $G$ consist of two components $G_1$ and $G_2$ with the structure functions $\Omega_1(x)$ and $\Omega_2(x)$ and the generating functions $\Phi_1(\alpha)$ and $\Phi_2(\alpha)$. Since, according to the formula (20) section 8, Chapter II,

$$\Omega(x) = \int \Omega_1(x - y) \, \Omega_2(y) \, dy,$$

we have

$$\Phi(\alpha) = \int e^{-\alpha x} \, \Omega(x) \, dx = \int e^{-\alpha x} \, dx \int \Omega_1(x - y) \, \Omega_2(y) \, dy$$

$$= \int \Omega_2(y) \, dy \int \Omega_1(x - y) \, e^{-\alpha x} \, dx$$

$$= \int \Omega_2(y) \, e^{-\alpha y} \, dy \int \Omega_1(z) \, e^{-\alpha z} \, dz = \Phi_1(\alpha) \, \Phi_2(\alpha).$$

It is clear that, using the method of mathematical induction, we can generalize this result to systems consisting of many components. Thus we come to the following rule:

(6) The generating function of a system $G$ is equal to the product of the generating functions of its components.

Thus, for example, if $G$ is a gas consisting of $n$ identical molecules and if $\varphi(\alpha)$ is the generating function of a single molecule we have

$$\Phi(\alpha) = [\varphi(\alpha)]^n.$$

If $G$ is a mixture of two gases consisting of $n_1$ molecules with the generating function $\varphi_1(\alpha)$ and $n_2$ molecules with the generating function $\varphi_2(\alpha)$, we have:

$$\Phi(\alpha) = [\varphi_1(\alpha)]^{n_1} [\varphi_2(\alpha)]^{n_2},$$

etc.

Thus we see that for a composite mechanical system, generating functions are subject to a much simpler composition law than the structure functions. This particular property of generating functions makes them particularly convenient for the study of systems consisting of a large number of components.
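The composition rule (6) can be checked numerically on a small example. The sketch below assumes the structure functions $\Omega_1(x) = \Omega_2(x) = x$ for $x > 0$, whose composition according to the convolution formula is $\Omega(x) = x^3/6$; these functions, the truncation of the integral at a finite upper limit, and the helper `laplace` are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def laplace(f, alpha, upper=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of exp(-alpha x) f(x) over (0, upper)."""
    h = upper / steps
    return h * sum(math.exp(-alpha * (i + 0.5) * h) * f((i + 0.5) * h)
                   for i in range(steps))

def omega1(x):
    return x               # assumed structure function of G_1

def omega2(x):
    return x               # assumed structure function of G_2

def omega(x):
    return x ** 3 / 6.0    # integral of omega1(x - y) * omega2(y) over 0 < y < x

alpha = 2.0
phi1 = laplace(omega1, alpha)
phi2 = laplace(omega2, alpha)
phi = laplace(omega, alpha)
print(phi, phi1 * phi2)
```

The two printed numbers agree up to the quadrature error, whereas the structure functions themselves are related by an integral rather than by a product.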

We may also remark that, on the basis of the general formula (19) Chapter II, the generating function $\Phi(\alpha)$ of the system $G$ can be expressed in the form

$$\Phi(\alpha) = \int_\Gamma e^{-\alpha E} \, dV,$$

where $E$ is the total energy of the system $G$ considered as a function of coordinates in the phase space $\Gamma$.


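17. Conjugate distribution laws. Consider again the system $G$ discussed in the previous section and use the same notation as before. Assume

$$U^{(\alpha)}(x) = \begin{cases} \dfrac{1}{\Phi(\alpha)} \, e^{-\alpha x} \, \Omega(x) & (x > 0), \\ 0 & (x < 0). \end{cases}$$

Since

$$U^{(\alpha)}(x) \geq 0, \qquad \int U^{(\alpha)}(x) \, dx = 1,$$

we conclude that $U^{(\alpha)}(x)$ represents a probability density for any $\alpha > 0$. For different $\alpha$'s we obtain an entire family of distribution laws. It is clear that this family is completely determined by the structure of the system $G$. We will call these the family of distribution laws conjugate with the system $G$.

Conversely, the structure function $\Omega(x)$ can be found from any of the conjugate distribution functions $U^{(\alpha)}(x)$ by means of the formula

(34) $$\Omega(x) = \Phi(\alpha) \, e^{\alpha x} \, U^{(\alpha)}(x).$$

The mathematical expectation and the dispersion of a quantity distributed according to the conjugate law $U^{(\alpha)}(x)$ can be expressed in a simple way through the generating function $\Phi(\alpha)$ and its derivatives. In fact:

(35) $$\bar{x} = \int x \, U^{(\alpha)}(x) \, dx = \frac{1}{\Phi(\alpha)} \int x \, e^{-\alpha x} \, \Omega(x) \, dx = -\frac{\Phi'(\alpha)}{\Phi(\alpha)} = -\frac{d \log \Phi(\alpha)}{d\alpha}.$$

Remembering the property (5) of the generating functions we deduce an important theorem:

Theorem: For any positive number $a$, one can always find among the conjugate functions $U^{(\alpha)}(x)$ one and only one function which has mathematical expectation $a$.

Furthermore, the dispersion corresponding to $U^{(\alpha)}(x)$ is given by the expression:

(36) $$\int (x - \bar{x})^2 \, U^{(\alpha)}(x) \, dx = \int x^2 \, U^{(\alpha)}(x) \, dx - \bar{x}^2 = \frac{1}{\Phi(\alpha)} \int x^2 e^{-\alpha x} \, \Omega(x) \, dx - \left[ \frac{\Phi'(\alpha)}{\Phi(\alpha)} \right]^2 = \frac{\Phi(\alpha)\Phi''(\alpha) - [\Phi'(\alpha)]^2}{[\Phi(\alpha)]^2} = \frac{d^2 \log \Phi(\alpha)}{d\alpha^2}.$$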
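Formulae (35) and (36) lend themselves to a direct numerical check. The following sketch assumes $\Omega(x) = x^{K-1}/\Gamma(K)$, so that $\Phi(\alpha) = \alpha^{-K}$ and the conjugate law is a gamma density; the exponent `K`, the value of $\alpha$, and the quadrature settings are illustrative assumptions. The moments of the conjugate density are computed by quadrature and compared with finite-difference logarithmic derivatives of $\Phi$.

```python
import math

K = 2.5  # assumed exponent: Omega(x) = x**(K-1) / Gamma(K), hence Phi(alpha) = alpha**(-K)

def conjugate_density(x, alpha):
    """U^(alpha)(x) = exp(-alpha x) * Omega(x) / Phi(alpha) for x > 0."""
    return alpha ** K * x ** (K - 1) * math.exp(-alpha * x) / math.gamma(K)

def log_phi(alpha):
    return -K * math.log(alpha)

def moment(g, alpha, upper=80.0, steps=80000):
    """Midpoint-rule integral of g(x) * U^(alpha)(x) over (0, upper)."""
    h = upper / steps
    return h * sum(g((i + 0.5) * h) * conjugate_density((i + 0.5) * h, alpha)
                   for i in range(steps))

alpha, h = 1.3, 1e-3
mean = moment(lambda x: x, alpha)
variance = moment(lambda x: (x - mean) ** 2, alpha)
d1 = (log_phi(alpha + h) - log_phi(alpha - h)) / (2 * h)                      # d log Phi / d alpha
d2 = (log_phi(alpha + h) - 2 * log_phi(alpha) + log_phi(alpha - h)) / h ** 2  # second derivative
print(mean, -d1, variance, d2)
```

The mean should agree with $-d\log\Phi/d\alpha$ and the dispersion with $d^2\log\Phi/d\alpha^2$, up to quadrature and finite-difference error.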

Finally, the composition law of the structure functions

$$\Omega(x) = \int \Omega_1(x - y) \, \Omega_2(y) \, dy$$

together with the expression (34) and the composition law of the generating functions (where $U_1^{(\alpha)}(x)$ and $U_2^{(\alpha)}(x)$ represent the conjugate distribution functions of the components $G_1$ and $G_2$) give us:

$$\Phi(\alpha) \, e^{\alpha x} \, U^{(\alpha)}(x) = \Phi_1(\alpha) \, \Phi_2(\alpha) \, e^{\alpha x} \int U_1^{(\alpha)}(x - y) \, U_2^{(\alpha)}(y) \, dy = \Phi(\alpha) \, e^{\alpha x} \int U_1^{(\alpha)}(x - y) \, U_2^{(\alpha)}(y) \, dy,$$

from which follows:

(37) $$U^{(\alpha)}(x) = \int U_1^{(\alpha)}(x - y) \, U_2^{(\alpha)}(y) \, dy.$$

Using the method of mathematical induction, we can generalize this important formula to the case where the system $G$ consists of any number $n$ of components with conjugate distributions $U_1^{(\alpha)}(x), \ldots, U_n^{(\alpha)}(x)$. We have

$$U^{(\alpha)}(x) = \int \cdots \int U_1^{(\alpha)}(x - y_2 - \cdots - y_n) \, U_2^{(\alpha)}(y_2) \cdots U_n^{(\alpha)}(y_n) \, dy_2 \cdots dy_n.$$

This is the composition law, well known in the theory of probability; it allows one to express the distribution of the sum of $n$ independent random quantities in terms of the probability densities of the individual terms. We are led to the following rule concerning the composition of the conjugate distributions: The conjugate distribution law of a given system can be derived from the corresponding distributions of its $n$ components in the same way as the distribution of a sum of $n$ mutually independent random quantities is found from the distributions of individual terms.

It is clear that the value of the parameter $\alpha$ is quite arbitrary, but must be the same for all systems in question.

18. Systems consisting of a large number of components. Consider a system $G$ consisting of the components $G_1, G_2, \ldots, G_n$, where $n$ is a very large number. According to the formula (34) of the previous section:

$$\Omega(x) = \Phi(\alpha) \, e^{\alpha x} \, U^{(\alpha)}(x).$$

We will try now to obtain a convenient approximate expression for the function $\Omega(x)$. The above expression for this function contains, apart from the elementary function $e^{\alpha x}$, the generating function $\Phi(\alpha)$ and the conjugate distribution function $U^{(\alpha)}(x)$. As can be easily seen and as we shall soon prove, the presence of the function $\Phi(\alpha)$ does not lead to any difficulties because of the extremely simple composition laws (section 16) governing the generating functions; in fact we have already seen that $\Phi(\alpha)$, being independent of $x$, plays the role of only a constant factor in the expression for the function $\Omega(x)$. Thus, the principal difficulty in our problem consists in finding a convenient approximate expression for the conjugate distribution function $U^{(\alpha)}(x)$.

We are helped here by the analytical methods of the theory of probability. In the previous section we have seen that $U^{(\alpha)}(x)$ represents the probability density for the sum of $n$ random quantities ($n$ being in the present case a very large number). For such cases the limit theorems of the theory of probability supply us with simple, convenient, and rather exact analytical approximations, the form of which does not depend on the special nature of the laws governing the separate components. These laws have only a small number of parameters entering into the approximate expressions. Thus, not having detailed information concerning the structure of the separate components of the system $G$ and basing our conclusions exclusively on the very large number of these components we can arrive at important conclusions concerning the properties of this system. This result is typical of any application of the theory of probability and demonstrates its principal advantage in the study of mass phenomena.

Let us remark that the value of the parameter $\alpha$ still remains entirely arbitrary, so that we can use this extra degree of freedom for the simplification of the future calculations.

Thus, rather than creating a special analytical formalism for the purposes of statistical mechanics we plan to use in all future calculations the conventional formalism of the theory of probability. The next chapter will be devoted to the discussion of the fundamental steps to be taken along this line.


CHAPTER V 


APPLICATION OF THE 
CENTRAL LIMIT THEOREM 

19. Approximate expressions of structure functions. The most convenient formulation of the so called "central limit theorem" of the theory of probability, which gives the approximate expression for the distribution law governing the sum of a large number of mutually independent random quantities, can be given in the following form:

Consider a sequence of mutually independent random quantities with probability densities $u_1(x), u_2(x), \ldots$, and characteristic functions $g_1(t), g_2(t), \ldots$, so that

$$g_k(t) = \int e^{itx} \, u_k(x) \, dx \qquad (k = 1, 2, \ldots).$$

Let

$$\int x \, u_k(x) \, dx = a_k,$$

$$\int (x - a_k)^2 \, u_k(x) \, dx = b_k,$$

$$\int |x - a_k|^3 \, u_k(x) \, dx = c_k,$$

$$\int (x - a_k)^4 \, u_k(x) \, dx = d_k,$$

$$\int |x - a_k|^5 \, u_k(x) \, dx = e_k,$$

and assume that the given distribution laws are subject to the following conditions:

(1) The functions $u_k(x)$ are differentiable and there exists a constant $L$ such that

$$\int |u_k'(x)| \, dx < L \qquad (k = 1, 2, \ldots).$$

(2) There exist positive constants $\alpha$ and $\beta$ $(\alpha < \beta)$ such that:

$$\alpha < b_k < \beta, \quad c_k < \beta, \quad d_k < \beta, \quad e_k < \beta \qquad (k = 1, 2, \ldots).$$

(3) There exist positive constants $\lambda$ and $\tau$ such that in the region $|t| < \tau$:

$$|g_k(t)| > \lambda \qquad (k = 1, 2, \ldots).$$

(4) For each interval $(c_1, c_2)$ $(c_1 c_2 > 0)$ there exists a number $\rho = \rho(c_1, c_2) < 1$ such that for any $t$ in the interval $(c_1, c_2)$:

$$|g_k(t)| < \rho \qquad (k = 1, 2, \ldots).$$

Let us put $\sum_{k=1}^{n} a_k = A_n$, $\sum_{k=1}^{n} b_k = B_n$ and write $U_n(x)$ for the probability density of the sum of the first $n$ random quantities. Then, for $n \to \infty$:

(39) $$U_n(x) = \frac{1}{(2\pi B_n)^{1/2}} \exp\left[ -\frac{(x - A_n)^2}{2B_n} \right] + \begin{cases} O\left( \dfrac{1 + |x - A_n|}{n^{3/2}} \right) & \text{for } |x - A_n| < 2\log^2 n, \\ O\left( \dfrac{1}{n} \right) & \text{for all } x. \end{cases}$$

The proof of this theorem is given in the appendix, together 
with a more exact formulation which, however, is not necessary 
for the purposes of the present chapter. 
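The order of the remainder in the limit theorem can be observed on a case where the density of the sum is known in closed form. The sketch below assumes identical densities $u_k(x) = x e^{-x}$ for $x > 0$ (an illustrative choice, not from the text), so that $a_k = b_k = 2$ and the sum of $n$ terms has a gamma density with shape $2n$. It evaluates the difference between the exact density and the normal term at $x = A_n$, scaled by $n^{3/2}$; the function names are hypothetical.

```python
import math

def exact_density_at_mean(n):
    """Density of the sum of n independent variables with density x*exp(-x), at x = 2n."""
    shape = 2 * n   # the sum has a gamma density with this shape and unit scale
    log_density = (shape - 1) * math.log(shape) - shape - math.lgamma(shape)
    return math.exp(log_density)

def normal_density_at_mean(n):
    """Leading (normal) term of the limit theorem at x = A_n, with B_n = 2n."""
    return 1.0 / math.sqrt(2 * math.pi * 2 * n)

scaled_errors = [
    n ** 1.5 * abs(exact_density_at_mean(n) - normal_density_at_mean(n))
    for n in (10, 100, 1000)
]
print(scaled_errors)
```

The scaled errors stay bounded as $n$ grows, as the estimate $O\bigl((1 + |x - A_n|)/n^{3/2}\bigr)$ at $x = A_n$ suggests for this example.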

As indicated at the end of the previous chapter, we must use the central limit theorem for estimating the conjugate distribution function $U^{(\alpha)}(x)$ of the given system $G$ under the assumption that the system consists of a very large number of components $g_1, g_2, \ldots, g_n$, with the structure functions $\omega_1(x), \omega_2(x), \ldots, \omega_n(x)$, the generating functions $\varphi_1(\alpha), \varphi_2(\alpha), \ldots, \varphi_n(\alpha)$ and conjugate functions $u_1^{(\alpha)}(x), u_2^{(\alpha)}(x), \ldots, u_n^{(\alpha)}(x)$. Since the latter functions play the role here of the functions $u_k(x)$ in the formulation of the limit theorem, we must initially make sure that the conjugate functions for the actual physical systems satisfy the conditions assumed in the proof of the limit theorem. This, however, does not present any difficulty.

The point is that the conditions imposed on the functions $u_k(x)$ in the limit theorem are equivalent to assuming the uniformity of one or another property which they describe. However, in statistical physics the separate components $g_k$ (molecules, atoms etc.) are always either of the same kind (homogeneous substance) or of a small number of different kinds (a mixture of several homogeneous substances). Thus, the structure functions and consequently also the conjugate functions for these components form a set within which all elements are either identical or break up into a small number of groups of identical elements. It is clear that under such conditions each characteristic of the functions $u_k^{(\alpha)}(x)$ appears uniformly in the entire set.

Let us consider now the separate conditions prescribed by the limit theorem. The structure function $\omega_k(x)$, as well as its derivative, is usually an analytic function which does not increase faster than a certain power of $x$ when $x \to \infty$; since:

$$u_k^{(\alpha)}(x) = \frac{1}{\varphi_k(\alpha)} \, e^{-\alpha x} \, \omega_k(x),$$

the condition (1) is always satisfied.

The functions $u_k^{(\alpha)}(x)$ obviously always possess finite moments of all orders, whereas the uniformity restrictions on these moments follow directly from the above general remarks; thus the condition (2) is also always satisfied. The situation with the conditions (3) and (4) pertaining to the characteristic functions is even simpler. In fact, the condition (3) demands that $g_k(t)$ does not become zero for sufficiently small $t$; this is a property common to any characteristic function. The condition (4) demands only that $|g_k(t)| \neq 1$ for $t \neq 0$; this also is a property common to the characteristic functions of any continuous distribution law.

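Thus we see that the use of the central limit theorem for the estimates of the conjugate distributions of a mechanical system gives definite answers. Introducing into the formula (34) (section 17 of the previous chapter) the approximate expression of $U^{(\alpha)}(x)$ as given by the expression (39) of the present section, and taking into account the formulae (35) and (36) (section 17, chapter IV), we must let

$$A_n = -\frac{d \log \Phi(\alpha)}{d\alpha}, \qquad B_n = \frac{d^2 \log \Phi(\alpha)}{d\alpha^2},$$

and thus obtain:

(40) $$\Omega(x) = \Phi(\alpha) \, e^{\alpha x} \left\{ \frac{1}{\left[ 2\pi \dfrac{d^2 \log \Phi(\alpha)}{d\alpha^2} \right]^{1/2}} \exp\left[ -\frac{\left( x + \dfrac{d \log \Phi(\alpha)}{d\alpha} \right)^2}{2 \dfrac{d^2 \log \Phi(\alpha)}{d\alpha^2}} \right] + \begin{cases} O\left( \dfrac{1 + \left| x + \dfrac{d \log \Phi(\alpha)}{d\alpha} \right|}{n^{3/2}} \right) & \text{for } \left| x + \dfrac{d \log \Phi(\alpha)}{d\alpha} \right| < 2\log^2 n, \\ O\left( \dfrac{1}{n} \right) & \text{for all } x. \end{cases} \right\}$$

We shall use this formula as a starting point in all future calculations.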

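Let us now consider the choice of the parameter $\alpha$ which so far has been arbitrary. In all cases when we speak about some system $G$ with the constant energy $a$ and about the different components of this system we will choose for $\alpha$ the simple root (comp. 5, section 16, chapter IV) of the equation:

$$\frac{d \log \Phi(\alpha)}{d\alpha} + a = 0.$$

This value of the parameter $\alpha$ we will denote in the future by $\vartheta$. We will also put:

$$\left( \frac{d^2 \log \Phi(\alpha)}{d\alpha^2} \right)_{\alpha = \vartheta} = B.$$

Using these notations for the structure function of the basic system $G$, we obtain from the formula (40) for $\alpha = \vartheta$ the following expression:

(41) $$\Omega(x) = \Phi(\vartheta) \, e^{\vartheta x} \left\{ \frac{1}{(2\pi B)^{1/2}} \exp\left[ -\frac{(x - a)^2}{2B} \right] + \begin{cases} O\left( \dfrac{1 + |x - a|}{n^{3/2}} \right) & \text{for } |x - a| < 2\log^2 n, \\ O\left( \dfrac{1}{n} \right) & \text{for all } x. \end{cases} \right\}$$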
/ 


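In particular, for $x = a$ this gives an important formula:

(42) $$\Omega(a) = \frac{\Phi(\vartheta) \, e^{\vartheta a}}{(2\pi B)^{1/2}} \left[ 1 + O(n^{-1}) \right].$$

This formula gives the approximate expression for the structure function corresponding to the surface of constant energy $\Sigma_a$ in terms of the generating function $\Phi(\alpha)$ and its second logarithmic derivative for $\alpha = \vartheta$, where $\vartheta$ is determined by the basic relation

$$\left( \frac{d \log \Phi(\alpha)}{d\alpha} \right)_{\alpha = \vartheta} + a = 0.$$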
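Formula (42) can be compared with an exact structure function in a simple case. The sketch below assumes $n$ identical molecules with $\omega(x) = x^{k-1}/\Gamma(k)$ (an illustrative choice, not from the text), for which $\Omega(a) = a^{nk-1}/\Gamma(nk)$, $\Phi(\alpha) = \alpha^{-nk}$, $\vartheta = nk/a$ and $B = nk/\vartheta^2$. Logarithms are used throughout to avoid overflow; the function names are hypothetical.

```python
import math

def log_omega_exact(a, n, k):
    """log Omega(a) for n identical molecules with omega(x) = x**(k-1)/Gamma(k)."""
    return (n * k - 1) * math.log(a) - math.lgamma(n * k)

def log_omega_approx(a, n, k):
    """log of Phi(theta) * exp(theta * a) / sqrt(2 pi B), the leading term of (42)."""
    theta = n * k / a              # root of d log Phi/d alpha + a = 0 for Phi = alpha**(-n k)
    big_b = n * k / theta ** 2     # second logarithmic derivative of Phi at alpha = theta
    log_phi = -n * k * math.log(theta)
    return log_phi + theta * a - 0.5 * math.log(2 * math.pi * big_b)

def relative_error(a, n, k=1.5):
    """Exact value divided by the approximation, minus one."""
    return math.exp(log_omega_exact(a, n, k) - log_omega_approx(a, n, k)) - 1.0

for n in (10, 100, 1000):
    print(n, n * relative_error(2.0 * n, n))
```

In this example the approximation amounts to replacing $\Gamma(nk)$ by its Stirling estimate, and the relative error multiplied by $n$ stays bounded, in agreement with the factor $1 + O(n^{-1})$.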



20. A small component and its energy. Boltzmann's law. We have seen in section 15, Chapter IV that for the system $G$ with the constant energy $a$ which can be split into two components $G_1$ and $G_2$, the distribution law of the separate component $G_1$ in its phase space has the probability density

(43) $$\frac{\Omega_2(a - E_1)}{\Omega(a)},$$

where $\Omega_2$ is the structure function of the component $G_2$ and $E_1$ is that function of the dynamic coordinates of the component $G_1$ which expresses its energy at the appropriate point of its phase space. Let us now assume, as is usually the case in statistical mechanics, that the system $G$ is composed of a very large number $n$ of separate components which we will call, for brevity, molecules. We will assume that these molecules are not very different in respect to their structure, so that we will be able to use the approximate formulae derived in the previous section; as already indicated, the necessary conditions are always satisfied in actual physical problems. Let the systems $G_1$ and $G_2$ consist of $n_1$ and $n_2$ $(n_1 + n_2 = n)$ molecules. Let us also assume that the molecules forming the component $G_1$ possess the structure functions $\omega_1(x), \ldots, \omega_{n_1}(x)$, the generating functions $\varphi_1(\alpha), \ldots, \varphi_{n_1}(\alpha)$ and the conjugate functions $u_1^{(\vartheta)}(x), \ldots, u_{n_1}^{(\vartheta)}(x)$, where $\vartheta$ is determined as the simple root of the equation:

$$\frac{d \log \Phi(\alpha)}{d\alpha} + a = 0.$$

Writing $a_k$ and $b_k$ for the mathematical expectation and the dispersion of the conjugate distribution $u_k^{(\vartheta)}(x)$, we have because of the formulae (35) and (36) (section 17, Chapter IV)

$$a_k = -\left( \frac{d \log \varphi_k(\alpha)}{d\alpha} \right)_{\alpha = \vartheta}, \qquad b_k = \left( \frac{d^2 \log \varphi_k(\alpha)}{d\alpha^2} \right)_{\alpha = \vartheta},$$

which we will write in the shorter form:

$$a_k = -\frac{d \log \varphi_k}{d\vartheta}, \qquad b_k = \frac{d^2 \log \varphi_k}{d\vartheta^2}.$$


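Since, because of the assumed enumeration of the molecules,

$$\Phi(\alpha) = \prod_{k=1}^{n} \varphi_k(\alpha), \qquad \Phi_1(\alpha) = \prod_{k=1}^{n_1} \varphi_k(\alpha), \qquad \Phi_2(\alpha) = \prod_{k=n_1+1}^{n} \varphi_k(\alpha),$$

we have

$$a = -\frac{d \log \Phi}{d\vartheta} = \sum_{k=1}^{n} a_k,$$

$$A_1 = -\frac{d \log \Phi_1}{d\vartheta} = \sum_{k=1}^{n_1} a_k, \qquad A_2 = -\frac{d \log \Phi_2}{d\vartheta} = \sum_{k=n_1+1}^{n} a_k,$$

$$A_1 + A_2 = a,$$

and also

$$B = \frac{d^2 \log \Phi}{d\vartheta^2} = \sum_{k=1}^{n} b_k,$$

$$B_1 = \frac{d^2 \log \Phi_1}{d\vartheta^2} = \sum_{k=1}^{n_1} b_k, \qquad B_2 = \frac{d^2 \log \Phi_2}{d\vartheta^2} = \sum_{k=n_1+1}^{n} b_k,$$

$$B_1 + B_2 = B.$$

From these relations it follows that the quantities $a$ and $B$ must be considered as infinitely large quantities of order $n$.

Let us assume now that the component $G_1$ represents a negligibly small part of the entire system $G$, i.e. that $n_1$ is negligibly small in comparison with $n$ (in particular, we can assume $n_1 = 1$, taking one single molecule as the component $G_1$). In our asymptotic formulae this condition is expressed by:

$$n_1 = o(n) \qquad (n \to \infty).$$

Since we have agreed to consider all quantities $a_k$ as well as all quantities $b_k$ as being of the same order of magnitude we can conclude from the above given group of formulae that:

$$A_1 = o(a), \qquad B_1 = o(B) \qquad (n \to \infty),$$

and consequently

$$A_2 \sim a, \qquad B_2 \sim B \qquad (n \to \infty).$$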

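Keeping this in mind let us apply the approximate formulae of the previous section to the expression (43). Because of the formula (42) section 19:

(44) $$\Omega(a) = \frac{\Phi(\vartheta) \, e^{\vartheta a}}{(2\pi B)^{1/2}} \left[ 1 + o(1) \right].$$

To get the expression for $\Omega_2(a - E_1)$ we will use the formula (41) of section 19. Obviously we must write $(a - E_1)$ instead of $x$, $\Phi_2(\vartheta)$ instead of $\Phi(\vartheta)$, $B_2$ instead of $B$, and $A_2$ instead of $a$. Since $A_2 = a - A_1$, the difference $x - a$ will be replaced by $A_1 - E_1$. In the remaining terms we can keep $n$ (rather than substituting $n_2$ for it) because of the relation $n_2 \sim n$. Thus we find

$$\Omega_2(a - E_1) = \Phi_2(\vartheta) \, e^{\vartheta(a - E_1)} \left\{ \frac{1}{(2\pi B_2)^{1/2}} \exp\left[ -\frac{(E_1 - A_1)^2}{2B_2} \right] + \begin{cases} O\left( \dfrac{1 + |E_1 - A_1|}{n^{3/2}} \right) & \text{for } |E_1 - A_1| < 2\log^2 n, \\ O\left( \dfrac{1}{n} \right) & \text{for all } E_1, \end{cases} \right\}$$

where $B_2 = O(a) = O(n)$. If we consider only such values of $E_1$ for which $E_1 - A_1 = o(n^{1/2})$ the bracket on the right side of the above formula becomes:

$$\frac{1}{(2\pi B_2)^{1/2}} \left[ 1 + o(1) \right],$$

and we will have:

$$\Omega_2(a - E_1) = \frac{\Phi_2(\vartheta) \, e^{\vartheta(a - E_1)}}{(2\pi B_2)^{1/2}} \left[ 1 + o(1) \right].$$

Comparing this with the formula (44), and noticing that $B_2 \sim B$ and $\Phi(\vartheta) = \Phi_1(\vartheta) \, \Phi_2(\vartheta)$, we obtain:

(45) $$\frac{\Omega_2(a - E_1)}{\Omega(a)} = \frac{e^{-\vartheta E_1}}{\Phi_1(\vartheta)} \left[ 1 + o(1) \right].$$

Thus we obtain for the distribution law of a small component, in its corresponding phase space, a very simple asymptotic formula (Boltzmann's law). The most important feature of this law is its exponential dependence on the energy of the small component in question, and the important role of the parameter $\vartheta$ suggests that this parameter must have a simple physical interpretation.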
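Boltzmann's law can be compared with the exact density (43) in a case where everything is available in closed form. The sketch below assumes $n$ identical molecules with $\omega(x) = x^{k-1}/\Gamma(k)$ and takes a single molecule as the small component; the exponent, the energy per molecule, and the function names are illustrative assumptions. It returns the ratio of the exact density $\Omega_2(a - E_1)/\Omega(a)$ to the Boltzmann expression $e^{-\vartheta E_1}/\varphi(\vartheta)$.

```python
import math

def log_exact_density(e1, a, n, k):
    """log of Omega_2(a - e1)/Omega(a) when G_1 is one of n identical molecules."""
    rest = (n - 1) * k   # exponent of the structure function of the remaining molecules
    log_omega2 = (rest - 1) * math.log(a - e1) - math.lgamma(rest)
    log_omega = (n * k - 1) * math.log(a) - math.lgamma(n * k)
    return log_omega2 - log_omega

def log_boltzmann_density(e1, a, n, k):
    """log of exp(-theta e1)/phi(theta), with phi(theta) = theta**(-k), theta = n k / a."""
    theta = n * k / a
    return -theta * e1 + k * math.log(theta)

def ratio(e1, n, k=1.5, energy_per_molecule=1.0):
    """Exact density divided by the Boltzmann approximation at molecule energy e1."""
    a = n * energy_per_molecule
    return math.exp(log_exact_density(e1, a, n, k) - log_boltzmann_density(e1, a, n, k))

for n in (100, 10000):
    print(n, ratio(1.0, n))
```

With the energy per molecule held fixed, the ratio approaches one as $n$ increases, which is the content of the factor $1 + o(1)$.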

Considering the energy of our small component, we can write for its probability density (according to section 15, Chapter IV)

$$\frac{\Omega_1(x) \, \Omega_2(a - x)}{\Omega(a)}.$$

According to formula (45), for $|x - A_1| = o(n^{1/2})$ we can write the above expression in the form:

(46) $$\frac{\Omega_1(x) \, e^{-\vartheta x}}{\Phi_1(\vartheta)} \left[ 1 + o(1) \right].$$

It must be noted that we have obtained for the approximate expression of the energy distribution of a small component the exact conjugate function of this component

$$U_1^{(\vartheta)}(x) = \frac{1}{\Phi_1(\vartheta)} \, e^{-\vartheta x} \, \Omega_1(x).$$

It is, of course, important that the parameter $\alpha$ assumes the value $\vartheta$, i.e. satisfies the equation:

$$\frac{d \log \Phi(\alpha)}{d\alpha} + a = 0.$$

Thus, we see that the conjugate distribution law of a small component (in particular of a single molecule), taken for $\alpha = \vartheta$, permits a simple physical interpretation as the energy distribution of this component.

When $G_1$ is a single molecule, $A_1 = a_1$ remains constant for increasing $n$, and the formula (46) applies uniformly when $x$ varies within arbitrary constant limits.

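The probability that the molecule will have an energy between $g_1$ and $g_2$ is given (for the $i$-th molecule) by the formula:

$$\frac{1}{\varphi_i(\vartheta)} \int_{g_1}^{g_2} \omega_i(x) \, e^{-\vartheta x} \, dx + o(1).$$

Hence the mathematical expectation of the number of molecules with the energy between $g_1$ and $g_2$ is given by:

$$\sum_{i=1}^{n} \frac{1}{\varphi_i(\vartheta)} \int_{g_1}^{g_2} \omega_i(x) \, e^{-\vartheta x} \, dx + o(n),$$

and in the case when all molecules have identical structure (structure function $\omega(x)$, generating function $\varphi(\alpha)$) we write:

$$\frac{n}{\varphi(\vartheta)} \int_{g_1}^{g_2} \omega(x) \, e^{-\vartheta x} \, dx + o(n).$$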

21. Mean values of the sum functions. In the present section we will consider the small component $G_1$ as being a separate molecule; since the enumeration of molecules is immaterial we will write $\omega_1(x)$ and $\varphi_1(\alpha)$ for the corresponding structure and generating functions of the selected molecule $g_1$.

Each phase function $f(x_1, x_2, \ldots)$, depending only on dynamical coordinates of the molecule $g_1$, can be interpreted as a function $f(P)$ of the point $P$ in the phase space $\gamma_1$ of this molecule. Since the set of dynamic coordinates of the molecule $g_1$ has the probability density

$$\frac{\Omega^{(1)}(a - e_1)}{\Omega(a)},$$

where $\Omega^{(1)}(x)$ is the structure function of the complementary system $G - g_1$, and $e_1$ is the energy of the molecule $g_1$, the mean value of the function $f$ is given by:

$$\bar{f} = \int_{\gamma_1} f(P) \, \frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} \, dv_1,$$

where $dv_1$ stands for the volume element in the phase space $\gamma_1$ of the molecule $g_1$, and it is assumed that the above integral converges absolutely.

Using the asymptotic formulae derived above, we can obtain the approximate expression of this integral, and estimate the corresponding error. To do this we break up the space $\gamma_1$ into two parts: $\gamma_1'$ being the set of points in the space $\gamma_1$ for which $e_1 < a_1 + \log^2 n$, and $\gamma_1''$ being the set of all other points. We put:

$$\int_{\gamma_1'} f(P) \, \frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} \, dv_1 = I', \qquad \int_{\gamma_1''} f(P) \, \frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} \, dv_1 = I'',$$

so that

$$\bar{f} = I' + I''.$$

In order to guarantee the convergence of our integral, we will assume that, for large energy values $e_1$, the absolute value of $f(P)$ increases not faster than a certain power of this energy; i.e. that:

$$f(P) = O(e_1^k) \qquad (e_1 \to \infty).$$

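Next, let us evaluate the integral $I''$. Putting, as usual,

$$-\frac{d \log \varphi_1}{d\vartheta} = a_1, \qquad \frac{d^2 \log \varphi_1}{d\vartheta^2} = b_1,$$

and writing $\Phi^{(1)}(\alpha)$ for the generating function of the system $G - g_1$, we can write (according to the formula (41) section 19):

$$\Omega^{(1)}(a - e_1) = \Phi^{(1)}(\vartheta) \, e^{\vartheta(a - e_1)} \left\{ \frac{1}{[2\pi(B - b_1)]^{1/2}} \exp\left[ -\frac{(e_1 - a_1)^2}{2(B - b_1)} \right] + O\left( \frac{1}{n} \right) \right\}$$

$$= \frac{\Phi^{(1)}(\vartheta) \, e^{\vartheta(a - e_1)}}{[2\pi(B - b_1)]^{1/2}} \left\{ \exp\left[ -\frac{(e_1 - a_1)^2}{2(B - b_1)} \right] + O\left( \frac{1}{n^{1/2}} \right) \right\},$$

from which follows, for sufficiently large $n$, that

$$\Omega^{(1)}(a - e_1) < \frac{2\,\Phi^{(1)}(\vartheta) \, e^{\vartheta(a - e_1)}}{(2\pi B)^{1/2}}.$$

On the other hand the formula (42) section 19 gives us, for sufficiently large $n$,

$$\Omega(a) > \frac{1}{2} \, \frac{\Phi(\vartheta) \, e^{\vartheta a}}{(2\pi B)^{1/2}},$$

which leads to:

$$\frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} < \frac{4 \, e^{-\vartheta e_1}}{\varphi_1(\vartheta)},$$

and consequently

$$|I''| \leq \int_{\gamma_1''} |f(P)| \, \frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} \, dv_1 < C \int_{\gamma_1''} e_1^k \, e^{-\vartheta e_1} \, dv_1,$$

where $C$ is a positive constant. Because of the general formula (18) section 8, Chapter II

$$\int_{\gamma_1''} e_1^k \, e^{-\vartheta e_1} \, dv_1 = \int_{a_1 + \log^2 n}^{\infty} x^k \, \omega_1(x) \, e^{-\vartheta x} \, dx,$$

we obtain

$$|I''| < C \int_{a_1 + \log^2 n}^{\infty} x^k \, \omega_1(x) \, e^{-\vartheta x} \, dx < C \exp\left( -\frac{\vartheta}{2} \log^2 n \right) \int_{a_1 + \log^2 n}^{\infty} x^k \, \omega_1(x) \, e^{-\vartheta x/2} \, dx.$$

Since finally, the last integral in the above expression tends to zero for $n \to \infty$, we obtain, for sufficiently large $n$:

(47) $$|I''| < \exp\left( -\frac{\vartheta}{2} \log^2 n \right) = o\left( \frac{1}{n} \right).$$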

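Let us now evaluate the integral $I'$. Because of the formula (41) section 19, we have, for $e_1 < a_1 + \log^2 n$:

$$\Omega^{(1)}(a - e_1) = \Phi^{(1)}(\vartheta) \, e^{\vartheta(a - e_1)} \left\{ \frac{1}{[2\pi(B - b_1)]^{1/2}} \exp\left[ -\frac{(e_1 - a_1)^2}{2(B - b_1)} \right] + O\left( \frac{1 + |e_1 - a_1|}{n^{3/2}} \right) \right\}$$

$$= \frac{\Phi^{(1)}(\vartheta) \, e^{\vartheta(a - e_1)}}{(2\pi B)^{1/2}} \left\{ 1 + O\left( \frac{1 + (e_1 - a_1)^2}{n} \right) \right\},$$

whereas the formula (42) section 19 gives

$$\Omega(a) = \frac{\Phi(\vartheta) \, e^{\vartheta a}}{(2\pi B)^{1/2}} \left[ 1 + O\left( \frac{1}{n} \right) \right].$$

It follows that

$$\frac{\Omega^{(1)}(a - e_1)}{\Omega(a)} = \frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)} \left[ 1 + O\left( \frac{1 + (e_1 - a_1)^2}{n} \right) \right],$$

and consequently

$$I' = \int_{\gamma_1'} f(P) \, \frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)} \left[ 1 + O\left( \frac{1 + (e_1 - a_1)^2}{n} \right) \right] dv_1 = \int_{\gamma_1'} f(P) \, \frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)} \, dv_1 + O\left( \frac{1}{n} \right).$$

The integration over the space $\gamma_1'$ can be extended over $\gamma_1$ without difficulty since as we have seen above

$$\int_{\gamma_1''} |f(P)| \, \frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)} \, dv_1 = o\left( \frac{1}{n} \right).$$

Thus taking into account (47), we obtain:

(48) $$\bar{f} = \int_{\gamma_1} f(P) \, \frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)} \, dv_1 + O\left( \frac{1}{n} \right).$$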


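In particular, when $f(P) = \chi(e_1)$ is a function of the energy of the selected molecule we obtain, because of the formula (19) section 8, Chapter II,

(49) $$\bar{\chi} = \frac{1}{\varphi_1(\vartheta)} \int \chi(x) \, e^{-\vartheta x} \, \omega_1(x) \, dx + O\left( \frac{1}{n} \right) = \int \chi(x) \, u_1^{(\vartheta)}(x) \, dx + O\left( \frac{1}{n} \right).$$

Thus, in particular,

(50) $$\bar{e}_1 = \int x \, u_1^{(\vartheta)}(x) \, dx + O\left( \frac{1}{n} \right) = a_1 + O\left( \frac{1}{n} \right),$$

and

$$\overline{(e_1 - a_1)^2} = \int (x - a_1)^2 \, u_1^{(\vartheta)}(x) \, dx + O\left( \frac{1}{n} \right) = b_1 + O\left( \frac{1}{n} \right),$$

etc. This formula emphasizes the role of the conjugate function $u_1^{(\vartheta)}(x)$ as the approximate distribution law for the energy of a molecule.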


Most of the phase functions which we encounter in statistical 
mechanics have a very special form. They are almost always 
the sums of functions each depending on the dynamic 
coordinates of only one molecule. Such phase functions we will 
call sum functions. Thus if a system $G$ is formed by the 
molecules $g_1, g_2, \ldots, g_n$, corresponding to the phase spaces 
$\gamma_1, \gamma_2, \ldots, \gamma_n$, the sum function can be written as: 

$$f(P) = \sum_{i=1}^{n} f_i(P_i),$$

where $P_i$ is some point in the space $\gamma_i$ $(i = 1, 2, \ldots, n)$. Since 
the mathematical expectation of a sum is always equal to the 
sum of the mathematical expectations of its terms, we obtain 
(using formula (48)) the following approximate expression for 
the phase average for such a sum function: 

$$\bar f = \sum_{i=1}^{n} \bar f_i = \sum_{i=1}^{n} \int_{\gamma_i} f_i(P_i)\, \frac{e^{-\vartheta e_i}}{\varphi_i(\vartheta)}\, dv_i + O(1)$$

(it goes without saying that the functions $f_i(P_i)$ must satisfy 
the general assumptions used in the derivation of the formula 
(48)). 

Example 1. The number of molecules with energy within certain 
limits. Let $0 < \alpha < \beta < +\infty$ and: 

$$f_i(P_i) = \begin{cases} 1 & \text{if } \alpha < e_i < \beta, \\ 0 & \text{in all other cases.} \end{cases}$$




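The sum function 

$$f(P) = \sum_{i=1}^{n} f_i(P_i)$$

evidently represents the number of the molecules of the 
given system with the energy between $\alpha$ and $\beta$. According to 
(49): 

$$\bar f = \sum_{i=1}^{n} \int_\alpha^\beta \frac{e^{-\vartheta x}}{\varphi_i(\vartheta)}\, \omega_i(x)\, dx + O(1) = \sum_{i=1}^{n} \int_\alpha^\beta u_i^{(\vartheta)}(x)\, dx + O(1).$$

In particular when all molecules are identical we have: 

$$\frac{\bar f}{n} = \int_\alpha^\beta u^{(\vartheta)}(x)\, dx + O\!\left(\frac{1}{n}\right).$$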
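
As a numerical illustration of Example 1, the following sketch evaluates the band fraction, i.e. the integral of the conjugate law over $(\alpha, \beta)$, by a simple midpoint rule. It assumes, purely for illustration, identical molecules whose conjugate law has the monatomic form $(2/\sqrt{\pi})\,\vartheta^{3/2}\sqrt{x}\,e^{-\vartheta x}$ obtained in section 24; the function names and the parameter values are hypothetical and not part of the text.

```python
import math

def conjugate_density(x, theta):
    """Assumed conjugate law of one monatomic molecule (form taken from section 24)."""
    return 2.0 / math.sqrt(math.pi) * theta**1.5 * math.sqrt(x) * math.exp(-theta * x)

def band_fraction(alpha, beta, theta, steps=20000):
    """Midpoint-rule estimate of the integral of the conjugate law over (alpha, beta)."""
    h = (beta - alpha) / steps
    return h * sum(conjugate_density(alpha + (j + 0.5) * h, theta) for j in range(steps))

def band_fraction_closed_form(alpha, beta, theta):
    """Same integral from the primitive erf(sqrt(t)) - 2*sqrt(t/pi)*exp(-t), with t = theta*x."""
    def primitive(x):
        t = theta * x
        return math.erf(math.sqrt(t)) - 2.0 * math.sqrt(t / math.pi) * math.exp(-t)
    return primitive(beta) - primitive(alpha)

if __name__ == "__main__":
    theta = 1.5  # illustrative value; the mean energy per molecule is then 3/(2*theta) = 1
    print(band_fraction(0.5, 2.0, theta), band_fraction_closed_form(0.5, 2.0, theta))
```

Multiplying the returned fraction by $n$ gives the approximate mean number of molecules in the band, up to the $O(1)$ term of the formula above.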

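Example 2. The energy of a large component. Let $G_1$ be the 
component of the system $G$ consisting of the molecules $g_1, 
g_2, \ldots, g_{n_1}$. Let $E_1$ be the energy of this component and 

$$f_i(P_i) = \begin{cases} e_i & (1 \le i \le n_1), \\ 0 & (n_1 < i \le n). \end{cases}$$

It is apparent that 

$$E_1 = \sum_{i=1}^{n} f_i(P_i),$$

so that formula (50) gives us: 

$$\bar E_1 = \sum_{i=1}^{n_1} a_i + O(1) = -\sum_{i=1}^{n_1} \frac{d \log \varphi_i}{d\vartheta} + O(1). \tag{51}$$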

Let us note here that we cannot use the same method for the 
evaluation of the dispersion of $E_1$, since the energies of different 
molecules forming the component $G_1$ are not independent so 
that the dispersion of their sum is different from the sum of 
their dispersions. This considerably more complicated question 
will be discussed later (Chapter VIII). 


22. Energy distribution law of a large component. The 
derivation of the asymptotic distribution law for the sum 
function considered as a random quantity is a rather difficult 
problem and will be considered in detail in one of the following 
chapters (Chapter VIII). However, in the most important case 
of the function $E_1$ considered in the previous section, the 
problem can be solved comparatively simply. 

Let $G_2$ be the component complementary to $G_1$, and let 
$E_2$ be its energy. We also put $n_2 = n - n_1$ and consider $n_1$ 
and $n_2$ as being infinitely large quantities of the order $n$. In 
general we will use the indices 1 and 2 to denote the quantities 
pertaining to $G_1$ and $G_2$; in particular we put: 

$$-\frac{d \log \Phi_i}{d\vartheta} = A_i, \qquad \frac{d^2 \log \Phi_i}{d\vartheta^2} = B_i \qquad (i = 1, 2).$$



According to the formula (27) section 15, Chapter IV the 
probability density of $E_1$ is given by 

$$\frac{\Omega_1(x)\,\Omega_2(a - x)}{\Omega(a)}.$$


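According to (41) and (42) of section 19: 

$$\Omega_1(x) = \Phi_1(\vartheta)\, e^{\vartheta x} \left\{ \frac{1}{(2\pi B_1)^{1/2}} \exp\left[-\frac{(x - A_1)^2}{2B_1}\right] + O\!\left(\frac{1}{n}\right) \right\},$$

$$\Omega_2(a - x) = \Phi_2(\vartheta)\, e^{\vartheta (a - x)} \left\{ \frac{1}{(2\pi B_2)^{1/2}} \exp\left[-\frac{(x - A_1)^2}{2B_2}\right] + O\!\left(\frac{1}{n}\right) \right\},$$

$$\Omega(a) = \frac{\Phi(\vartheta)\, e^{\vartheta a}}{(2\pi B)^{1/2}} \left[ 1 + O\!\left(\frac{1}{n}\right) \right].$$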





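Writing, for brevity, $B_1 B_2 / B = B^*$ and remembering that, 
because of the law of composition for the generating functions, 
$\Phi(\vartheta) = \Phi_1(\vartheta)\,\Phi_2(\vartheta)$ (and hence $B = B_1 + B_2$), 
we get: 

$$\frac{\Omega_1(x)\,\Omega_2(a - x)}{\Omega(a)} = \frac{1}{(2\pi B^*)^{1/2}} \exp\left\{ -\frac{(x - A_1)^2}{2B_1} - \frac{(x - A_1)^2}{2B_2} \right\} + O\!\left(\frac{1}{n}\right) = \frac{1}{(2\pi B^*)^{1/2}} \exp\left[ -\frac{(x - A_1)^2}{2B^*} \right] + O\!\left(\frac{1}{n}\right). \tag{52}$$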


Thus, for $n \to \infty$, the distribution of $E_1$ is given by the Gauss 
distribution function with the maximum at $A_1$ and the 
dispersion 

$$B^* = \frac{B_1 B_2}{B}.$$

If the energies of the molecules composing the component 
$G_1$ were mutually independent the dispersion of their sum 
would be 

$$\sum_{i=1}^{n_1} b_i = B_1.$$

Thus we see that the true dispersion $B^* < B_1$, which is 
quite understandable. In fact, since the sum of energies of all 
$n$ molecules is fixed, the energies of individual molecules are 
correlated negatively, so that the dispersion of their sum is 
smaller than the sum of their dispersions. 
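
The reduction of the dispersion from $B_1$ to $B^*$ can be checked in a case where the exact law is known. The sketch below assumes, for illustration only, $n$ identical monatomic molecules with the structure function of section 23, so that each $\Omega_i(x)$ is proportional to a power of $x$ and the exact density $\Omega_1(x)\Omega_2(a-x)/\Omega(a)$ is a Beta law in $x/a$ with parameters $3n_1/2$ and $3n_2/2$. The function names and numerical values are hypothetical.

```python
def gaussian_dispersion(n1, n, a):
    """Return (B*, B1) for n identical monatomic molecules with total energy a."""
    theta = 3.0 * n / (2.0 * a)      # root of -d log Phi / d theta = a for this gas
    b = 3.0 / (2.0 * theta**2)       # dispersion of one molecule's conjugate law
    B1 = n1 * b
    B2 = (n - n1) * b
    B = B1 + B2
    return B1 * B2 / B, B1

def exact_dispersion(n1, n, a):
    """Variance of E1 when E1/a follows a Beta law with parameters 3*n1/2 and 3*(n-n1)/2."""
    p = 1.5 * n1
    q = 1.5 * (n - n1)
    return a**2 * p * q / ((p + q) ** 2 * (p + q + 1.0))

if __name__ == "__main__":
    b_star, b_one = gaussian_dispersion(1000, 3000, 4500.0)
    print(b_star, b_one, exact_dispersion(1000, 3000, 4500.0))
```

For these values the exact variance differs from $B^*$ only by the factor $(3n/2)/(3n/2 + 1)$, while both are well below $B_1$.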


23. Example of monatomic ideal gas. As an example of 
the application of the above discussed general methods we will 
consider now the simplest statistical system: a monatomic 
ideal gas. This served as the first example in the development 
of the basic ideas of statistical mechanics. Under the name 
"ideal monatomic gas" we will imagine a system $G$ whose 
molecules $g_1, g_2, \ldots, g_n$ are simply material points. As usual, 
the total energy of the system is the sum of the energies of the 
individual molecules so that the molecules must not possess 
any mutual potential energy; as we have seen in section 8, 
Chapter II, this condition, which is unavoidable for the 
applicability of our methods, is actually never fulfilled in reality 
so that we have to consider it as only an approximation. We 
will assume that our gas (system $G$) is contained in a vessel 
with the finite volume $V$; this condition will be expressed 
formally by the introduction of a special term $U_i(x_i, y_i, z_i)$ 
representing the potential of the walls into the expression for 
the energy $e_i$ of the molecule $g_i$ (with the coordinates $x_i, y_i, z_i$). 
Since we assume that the system $G$ is not subject to any 
outside forces we can write 

$$e_i = \frac{m_i}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2) + U_i(x_i, y_i, z_i),$$

where $m_i$ is the mass of the molecule $g_i$, whereas $\dot x_i, \dot y_i, \dot z_i$ are the 
components of its velocity. We will assume that the forces 
between the walls of the vessel and the molecules are different 
from zero only at very small distances from the wall. If we 
require that not a single molecule, no matter how fast it is 
moving, can penetrate through the walls of the vessel, we 
must assume that $U_i$ is infinitely large outside the vessel. 
Inside the vessel we can assume $U_i$ to be an arbitrary constant, 
putting for simplicity $U_i = 0$. Of course, such a description 
of the function $U_i$ ($U_i = 0$ inside, $U_i = +\infty$ outside) is only 
an approximate one; it would be more correct to assume that 
the function $U_i$ is continuous and increases very rapidly when 
the molecule approaches the wall. However, we will use this 
idealized concept of $U_i$ since it considerably simplifies the 
calculations without essentially affecting the results. 

The Hamiltonian dynamic coordinates of the molecule $g_i$ 
are represented by its three Cartesian coordinates and three 
components of its momentum, 

$$p_i = \frac{\partial e_i}{\partial \dot x_i} = m_i \dot x_i, \qquad q_i = m_i \dot y_i, \qquad r_i = m_i \dot z_i.$$

Therefore, 

$$e_i = \frac{1}{2m_i} (p_i^2 + q_i^2 + r_i^2) + U_i(x_i, y_i, z_i),$$

and consequently 

$$E = \sum_{i=1}^{n} \frac{1}{2m_i} (p_i^2 + q_i^2 + r_i^2) + \sum_{i=1}^{n} U_i(x_i, y_i, z_i),$$

where $E$ is the total energy of the system $G$. 

For the function $V(x)$, which expresses the volume of the part 
of the space $\Gamma$ where $E < x$, we have the expression: 

$$V(x) = \int_{E<x} dV = \int_{E<x} \prod_{i=1}^{n} dx_i\, dy_i\, dz_i\, dp_i\, dq_i\, dr_i.$$


Since outside of the vessel the potential energy, and 
consequently the total energy $E$, of the system $G$ is infinitely large, 
the integration is carried out only inside the vessel. Since, on 
the other hand, the potential energy inside is equal to zero, 
we have: 

$$V(x) = V^n \int_{\sum_{i} \frac{1}{2m_i}(p_i^2 + q_i^2 + r_i^2) < x} \; \prod_{i=1}^{n} dp_i\, dq_i\, dr_i.$$

The above integral represents the volume of an ellipsoid in the 
$3n$-dimensional space with the semi-axes $(2m_i x)^{1/2}$ $(i = 1, 
2, \ldots, n)$, each occurring three times. This gives us 

$$V(x) = V^n\, \frac{(2\pi)^{3n/2}}{\Gamma[(3n/2) + 1]} \left( \prod_{i=1}^{n} m_i^{3/2} \right) x^{3n/2}.$$



Thus the structure function of the system $G$ can be written as 

$$\Omega(x) = \frac{dV(x)}{dx} = V^n\, \frac{(2\pi)^{3n/2}}{\Gamma[(3n/2) + 1]} \left( \prod_{i=1}^{n} m_i^{3/2} \right) \frac{3n}{2}\, x^{(3n/2) - 1}. \tag{53}$$


In this elementary example the expression for the structure 
function is so simple that one does not require an approximate 
formula. However, for purposes of illustration we will 
construct the asymptotic formula for $\Omega(x)$ and will compare it 
with the exact expression (53). For the generating function 
of the system $G$ we have: 


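$$\Phi(\vartheta) = \int_0^\infty \Omega(x)\, e^{-\vartheta x}\, dx = V^n\, \frac{(2\pi)^{3n/2}}{\Gamma[(3n/2) + 1]} \left( \prod_{i=1}^{n} m_i^{3/2} \right) \frac{3n}{2} \int_0^\infty x^{(3n/2) - 1} e^{-\vartheta x}\, dx = V^n (2\pi)^{3n/2} \left( \prod_{i=1}^{n} m_i^{3/2} \right) \vartheta^{-3n/2}. \tag{54}$$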




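If $x$ is the total energy of the system $G$, the quantity $\vartheta$ is 
determined as the root of the equation: 

$$-\frac{d \log \Phi}{d\vartheta} = \frac{3n}{2\vartheta} = x,$$

thus, 

$$\vartheta = \frac{3n}{2x}, \tag{55}$$

and consequently: 

$$\Phi(\vartheta) = V^n (2\pi)^{3n/2} \left( \frac{3n}{2} \right)^{-3n/2} x^{3n/2} \prod_{i=1}^{n} m_i^{3/2}.$$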


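From the formula (42) section 19, where 

$$B = \frac{d^2 \log \Phi}{d\vartheta^2} = \frac{3n}{2\vartheta^2} = \frac{2x^2}{3n},$$

we obtain the asymptotic expression: 

$$\Omega(x) \approx \frac{\Phi(\vartheta)\, e^{\vartheta x}}{[2\pi (d^2 \log \Phi / d\vartheta^2)]^{1/2}} = V^n\, \frac{(2\pi)^{3n/2}\, e^{3n/2}}{[(3n/2)]^{3n/2}\, [2\pi (3n/2)]^{1/2}} \left( \prod_{i=1}^{n} m_i^{3/2} \right) \frac{3n}{2}\, x^{(3n/2) - 1}. \tag{56}$$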




Comparing the approximate expression (56) with the exact 
expression (53) we see that in our simple case the method leads 
to the substitution of the quantity $\Gamma[(3n/2) + 1]$ in the formula 
(53) by its asymptotic expression given by Stirling's formula. 

We will be satisfied for the time being with considering only 
this simplest system as an illustration of our general method; 
the complete theory of the monatomic ideal gas will be 
discussed in the next chapter. 
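
The size of the error introduced by this substitution is easy to inspect numerically. The following sketch compares $\log \Gamma[(3n/2) + 1]$ with the logarithm of the Stirling expression $N^N e^{-N} (2\pi N)^{1/2}$, $N = 3n/2$, that appears in (56); the function names are illustrative, and the comparison is made on logarithms only to avoid overflow.

```python
import math

def log_gamma_exact(n):
    """log Gamma(3n/2 + 1), the factor entering the exact formula (53)."""
    return math.lgamma(1.5 * n + 1.0)

def log_gamma_stirling(n):
    """Logarithm of N**N * exp(-N) * sqrt(2*pi*N) with N = 3n/2, as in (56)."""
    N = 1.5 * n
    return N * math.log(N) - N + 0.5 * math.log(2.0 * math.pi * N)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        diff = log_gamma_exact(n) - log_gamma_stirling(n)
        print(n, diff, 1.0 / (12.0 * 1.5 * n))
```

The difference of the logarithms behaves like $1/(12N)$, so the relative error of (56) is of order $1/n$, in agreement with the accuracy claimed for the general method.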


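24. The theorem of equipartition of energy. We have seen 
(section 20) that the conjugate distribution function 

$$u_i^{(\vartheta)}(x) = \frac{e^{-\vartheta x}\, \omega_i(x)}{\varphi_i(\vartheta)}$$

for a small component (in particular for a molecule) of a given 
system $G$ represents an approximate expression of the energy 
distribution of this component. In the case of the monatomic 
ideal gas we have 

$$\omega_i(x) = \frac{V (2\pi m_i)^{3/2}}{\Gamma(3/2)}\, x^{1/2} = 2\pi V (2m_i)^{3/2} x^{1/2}, \qquad \varphi_i(\vartheta) = V (2\pi m_i)^{3/2}\, \vartheta^{-3/2}$$

(these formulae could be derived in a similar way as the formulae 
(53) and (54) of section 23, or can be obtained directly as 
the special case of these formulae for $n = 1$). Thus: 

$$u_i^{(\vartheta)}(x) = \frac{2}{\sqrt{\pi}}\, \vartheta^{3/2} \sqrt{x}\, e^{-\vartheta x}.$$

In particular, for the mean value of the energy of the 
molecule $g_i$ we have an approximate expression: 

$$\bar e_i \approx \int x\, u_i^{(\vartheta)}(x)\, dx = -\frac{d \log \varphi_i}{d\vartheta} = \frac{3}{2\vartheta}. \tag{57}$$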

In the case of an ideal monatomic gas we have considered all 
molecules as possessing identical structure (i.e. the identical 
expression for energy in terms of the dynamic coordinates), 
although their masses could be different. We see now that the 
mean value of energy as well as its distribution law is the same 
for all molecules independent of their masses. Thus, in the 
mixture of two gases (a heavy one and a light one) the mean 
energy of a molecule is the same for both components. 
Furthermore this mean energy value and its distribution are independent 
of the volume of the vessel, being a universal function of the 
parameter $\vartheta$. This result, pertaining to the mean energy of a 
molecule, represents a special case of a general theorem of 
statistical mechanics known as "the law of equipartition of 
energy among the degrees of freedom". The importance of this 
theorem lies in the fact that in many cases it permits us to 
find the mean energy of one or the other component of the 
system almost without any calculation. We will now give the 
general proof of this theorem, omitting however some possible 
generalizations. 

Consider a system $G$ whose component $G_1$ possesses $l$ 
degrees of freedom and is described by the Hamiltonian variables 
$q_1, \ldots, q_l, p_1, \ldots, p_l$. We assume that the total energy 
$E_1$ of the system $G_1$ is its kinetic energy. This can be written 
generally as a quadratic form in $p_1, p_2, \ldots, p_l$, with 
coefficients which may depend on $q_1, q_2, \ldots, q_l$. Let us denote 
it by $H(q_i, p_k)$. Writing, as usual, $V_1(x)$ for the volume of that 
part of the phase space occupied by the component $G_1$ where 
$E_1 < x$, we have 

$$V_1(x) = \int_{H(q_i, p_k) < x} dq_1 \cdots dq_l\, dp_1 \cdots dp_l = \int dq_1 \cdots dq_l \int_{H(q_i, p_k) < x} dp_1 \cdots dp_l,$$

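where the inner integral, for fixed values of $q_1, q_2, \ldots, q_l$, 
is evaluated within an ellipsoid in the $l$-dimensional space, 
representing its volume. It is clear that this volume is 
proportional to $x^{l/2}$ and that the coefficient depends on $q_1, q_2, 
\ldots, q_l$. Thus: 

$$V_1(x) = c_1 x^{l/2},$$

where $c_1$ (as well as $c_2$ and $c_3$, which will be introduced later) is 
a positive constant. We get from this 

$$\omega_1(x) = V_1'(x) = c_2\, x^{(l/2) - 1} \qquad (x > 0),$$

$$\varphi_1(\vartheta) = c_2 \int_0^\infty x^{(l/2) - 1} e^{-\vartheta x}\, dx = c_3\, \vartheta^{-l/2},$$

and consequently 

$$\bar E_1 \approx -\frac{d \log \varphi_1}{d\vartheta} = \frac{l}{2\vartheta}. \tag{58}$$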


This relation expresses the theorem which we intended to prove. 
In fact, it shows that the mean energy of a component of the 
given system is proportional to the corresponding number of 
degrees of freedom, the coefficient of proportionality being the 
quantity $1/(2\vartheta)$. In particular, since for the molecule of 
monatomic ideal gas $l = 3$, the formula (57) derived earlier 
represents a particular case of the general formula (58). 
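
The relation between the number of degrees of freedom and the mean energy can be checked numerically. The sketch below takes a structure function proportional to $x^{(l/2)-1}$, as obtained above for a purely kinetic component, and computes the mean of the corresponding conjugate law by a midpoint rule. The constant factor cancels in the ratio; the function name, the truncation point and the step count are illustrative choices, and the rule is only adequate for $l \ge 2$, where the integrand is bounded at the origin.

```python
import math

def mean_energy(l, theta, steps=200000):
    """Mean of the conjugate law for omega(x) = x**(l/2 - 1), midpoint rule, l >= 2."""
    upper = 60.0 / theta              # exp(-60) makes the neglected tail negligible
    h = upper / steps
    num = 0.0
    den = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        w = x ** (0.5 * l - 1.0) * math.exp(-theta * x)
        num += x * w                  # numerator: integral of x * omega(x) * exp(-theta x)
        den += w                      # denominator: the generating function, up to a constant
    return num / den

if __name__ == "__main__":
    theta = 0.7
    for l in (2, 3, 5):
        print(l, mean_energy(l, theta), l / (2.0 * theta))
```

For each $l$ the computed mean agrees with $l/(2\vartheta)$ to within the quadrature error.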

It is interesting to notice that the theorem of equipartition 
of energy which was proved using the approximate expression 
for the mean energy actually holds for the exact expression (of 
course in this case we get a somewhat different coefficient of 
proportionality; the quantity $\vartheta$ arose from our approximate 
analysis and does not exist at all in the exact theory). In order 
to prove this we must notice that the Hamiltonian function 
$H(q_i, p_k)$ of the selected component, being a quadratic form 
in the variables $p_k$, must satisfy the Euler relation 

$$\sum_{k=1}^{l} p_k \frac{\partial H}{\partial p_k} = 2H,$$

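and therefore 

$$\bar E_1 = \frac{1}{\Omega(a)} \int_{\Sigma_a} H\, \frac{d\Sigma}{\operatorname{grad} E} = \frac{1}{2\Omega(a)} \sum_{k=1}^{l} \int_{\Sigma_a} p_k \frac{\partial H}{\partial p_k}\, \frac{d\Sigma}{\operatorname{grad} E} = \frac{1}{2\Omega(a)} \sum_{k=1}^{l} \int_{\Sigma_a} p_k \frac{\partial E}{\partial p_k}\, \frac{d\Sigma}{\operatorname{grad} E}, \tag{59}$$

since in the expression of the total energy $E$ the variables $p_k$ 
only enter in $H$. Thus: 

$$\frac{\partial E}{\partial p_k} = \frac{\partial H}{\partial p_k} \qquad (1 \le k \le l).$$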


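For each of the surface integrals on the right hand side of 
(59) a volume integral can be substituted according to 
Green's theorem. Since $(1/\operatorname{grad} E)\, \partial E/\partial p_k$ is the cosine of the 
angle between the outward normal to the surface and the 
$p_k$-axis, we have 

$$\int_{\Sigma_a} F\, \frac{\partial E}{\partial p_k}\, \frac{d\Sigma}{\operatorname{grad} E} = \int_{V_a} \frac{\partial F}{\partial p_k}\, dV,$$

where $F(q_i, p_k)$ is an arbitrary function of the dynamic 
variables of the system $G$ and $V_a$ denotes the part of the phase 
space where $E < a$. In particular: 

$$\int_{\Sigma_a} p_k\, \frac{\partial E}{\partial p_k}\, \frac{d\Sigma}{\operatorname{grad} E} = \int_{V_a} \frac{\partial p_k}{\partial p_k}\, dV = V(a),$$

i.e. each such integral is equal to the volume of that part of 
the phase space of the system $G$ where $E < a$. Therefore the 
formula (59) gives us: 

$$\bar E_1 = \frac{l\, V(a)}{2\, \Omega(a)}, \tag{60}$$

proving the above statement. 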

Comparing the exact formula (60) with the formula (58) 
which (according to (50) section 21) is exact up to terms of 
order $1/n$ we obtain 

$$\vartheta \approx \frac{\Omega(a)}{V(a)} = \frac{d \log V(a)}{da}; \tag{61}$$

this gives us the approximate expression of the parameter $\vartheta$ in 
terms of $V(a)$, i.e. the approximate solution of the equation 

$$-\frac{d \log \Phi}{d\vartheta} = a,$$

which determines the quantity $\vartheta$. The formula (61) plays an 
important role in some theoretical studies. 





Example. Rotational energy of a molecule in diatomic ideal gas. 
We will imagine a molecule of a diatomic gas as a pair of 
material points connected by a rigid massless axis of infinitely 
small length. The position of such a system in space is 
determined by five parameters for which we will choose three 
Cartesian coordinates $x, y, z$ of one of the points and two 
angular coordinates $\phi$ and $\psi$ determining the direction of the 
axis. Writing $p_x, p_y, p_z, p_\phi, p_\psi$ for the corresponding 
momenta, $m$ for the mass of the molecule, and $A$ for the moment 
of inertia of the second point with respect to a center of 
rotation at the first we can express the total kinetic energy of the 
molecule as the sum of the translational energy: 

$$e_t = \frac{1}{2m} (p_x^2 + p_y^2 + p_z^2)$$

and the rotational energy: 

$$e_r = \frac{1}{2A} \left( p_\phi^2 + \frac{p_\psi^2}{\sin^2 \phi} \right) \tag{62}$$


(we shall see however that a knowledge of this expression is 
unimportant for the solution of our problem). 

The molecule in question represents a component of the gas 
(whose other components may have, however, an entirely 
different structure). On the basis of the general definition of a 
component (section 8, Chapter II) we can consider each of the 
two sets of dynamical variables $(x, y, z, p_x, p_y, p_z)$ and 
$(\phi, \psi, p_\phi, p_\psi)$ as an individual component of our gas; these 
two components can be considered as the fictitious "carriers" 
of the energies $e_t$ and $e_r$ corresponding to three and two degrees 
of freedom. The determination of the mean energy of either of 
these components can be done without calculation by using the 
theorem of equipartition of energy. This theorem leads 
immediately to our previous formula for the translational energy, 
whereas for the rotational energy it gives 

$$\bar e_r = 2 \cdot \frac{1}{2} \left[ \frac{d \log V(a)}{da} \right]^{-1} = \left[ \frac{d \log V(a)}{da} \right]^{-1} \approx \frac{1}{\vartheta}.$$





Thus we see that the theorem of equipartition of energy 
among the degrees of freedom saves us many difficult 
calculations which we would have to carry out in the case of a more 
complicated statistical system. 

For the sake of completeness let us find the approximate 
expression for the distribution of $e_r$. Since: 

$$V_r(x) = \int_{e_r < x} d\phi\, d\psi\, dp_\phi\, dp_\psi = \int_0^{2\pi} d\psi \int_0^{\pi} d\phi \iint_{p_\phi^2 + p_\psi^2 / \sin^2 \phi \,<\, 2Ax} dp_\phi\, dp_\psi = 8\pi^2 A x$$

(here the inner integral represents the area of an ellipse with 
the semi-axes $(2Ax)^{1/2}$ and $(2Ax)^{1/2} \sin \phi$) we find that the 
structure function of the "fictitious carrier" of the energy $e_r$ is 

$$\omega_r(x) = V_r'(x) = 8\pi^2 A,$$

so that the generating function is 

$$\varphi_r(\vartheta) = 8\pi^2 A \int_0^\infty e^{-\vartheta x}\, dx = \frac{8\pi^2 A}{\vartheta}.$$

This gives us, by the way, the familiar result: 

$$\bar e_r \approx -\frac{d \log \varphi_r(\vartheta)}{d\vartheta} = \frac{1}{\vartheta}.$$


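The conjugate function $u_r^{(\vartheta)}(x)$ is determined by the 
expression 

$$u_r^{(\vartheta)}(x) = \frac{e^{-\vartheta x}\, \omega_r(x)}{\varphi_r(\vartheta)} = \vartheta\, e^{-\vartheta x},$$

which is the approximate expression for the probability density 
of the quantity $e_r$. From the above it follows that 

$$P(x_1 < e_r < x_2) \approx e^{-\vartheta x_1} - e^{-\vartheta x_2}.$$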








Thus, the rotational energy of a diatomic molecule is subject 
to an exponential distribution depending exclusively on the 
parameter $\vartheta$. 
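
The phase volume of the rotational "carrier" and the resulting exponential law can be reproduced with a few lines of code. The sketch below integrates the ellipse area $\pi (2Ax) \sin \phi$ over the two angles by a midpoint rule and evaluates the probability of an energy band from the exponential density; the function names and the numerical values of $A$, $x$ and $\vartheta$ are illustrative assumptions.

```python
import math

def rotational_phase_volume(A, x, steps=10000):
    """Volume of the region e_r < x: ellipse area integrated over phi in (0, pi), psi in (0, 2*pi)."""
    h = math.pi / steps
    inner = 0.0
    for j in range(steps):
        phi = (j + 0.5) * h
        inner += math.pi * (2.0 * A * x) * math.sin(phi) * h  # area of the momentum ellipse
    return 2.0 * math.pi * inner                              # the psi integration gives 2*pi

def band_probability(theta, x1, x2):
    """Probability that x1 < e_r < x2 under the density theta * exp(-theta * x)."""
    return math.exp(-theta * x1) - math.exp(-theta * x2)

if __name__ == "__main__":
    A, x = 2.0, 0.3
    print(rotational_phase_volume(A, x), 8.0 * math.pi**2 * A * x)
    print(band_probability(1.5, 0.0, 1.0 / 1.5))  # mass of the law below its mean 1/theta
```

Since the volume is linear in $x$, its derivative is the constant structure function used above, which is what makes the conjugate law purely exponential.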

25. A system in thermal equilibrium. Canonical distribution 
of Gibbs. In the previous discussions we have been 
considering $G$ as an isolated system which is not subject to energy 
exchange with the surrounding medium; thus the total energy 
$E$ was always considered as a constant. It is clear that in actual 
physical systems such an assumption can be only approximately 
true since any actual physical system, no matter how 
well isolated, nevertheless undergoes some kind of energy 
interaction with its surroundings. 

Another possible idealization consists in considering the sys¬ 
tem G as being a component of another much larger system 
G*. In this case the component can freely exchange energy 
with its surroundings, i.e., with the other parts of the system 
G*, and the energy E of the system G must be considered as a 
random quantity varying with time. Its distribution law can 
be derived on the basis of the general formula which we have 
proved earlier. It is clear that the question as to which of the 
two idealizations is closer to reality must be decided on the 
basis of purely physical considerations in each particular case. 

We have seen above (Chapter IV, section 14) that in the 
first case (completely isolated system) the fundamental 
distribution law for the system $G$ can be obtained as follows: the 
point $P$ in the phase space $\Gamma$ of the system $G$, representing 
the state of this system, is always located on the surface $\Sigma_a$ 
and is distributed with the surface density: 

$$\frac{1}{\Omega(a)\, \operatorname{grad} E}. \tag{63}$$

In the second case (freely interacting system) we arrive at 
an entirely different fundamental law. Since $G$ is now a small 
component of the system $G^*$, the point $P$ is no longer bound 
to any surface of constant energy, but can move freely in the 
space $\Gamma$; according to the previously derived distribution law 
for the small component (section 20), the point $P$ is distributed 
in the space $\Gamma$ according to the probability density whose 
approximate form is 

$$\frac{1}{\Phi(\vartheta^*)}\, e^{-\vartheta^* E} \tag{64}$$

(the quantities marked by asterisks pertain to the system 
$G^*$, so that the quantity $\vartheta^*$ is a root of the equation 
$-[d \log \Phi^*(\alpha)]/d\alpha = E^*$). 

This latter idealized picture is usually referred to as a system 
in thermal equilibrium. A uniform constant temperature in 
this case is due to the free interaction between the system $G$ 
and its surroundings. 

According to Gibbs the fundamental distribution law (63) 
corresponding to the first idealized picture is called a 
microcanonical distribution, whereas the law (64) corresponding to 
the second idealized picture is known as canonical distribution. 

The fundamental difference between these two distribution 
laws lies in the fact that whereas (63) gives the distribution 
on the surface $\Sigma_a$, (64) establishes the distribution in the 
entire phase space $\Gamma$. 

Let us now consider the canonical distribution (64) in greater 
detail. As we know (section 20), the distribution of energy $E$ 
in the system $G$ (considered as a small component of the 
system $G^*$) is given by the probability density 

$$\frac{e^{-\vartheta^* x}\, \Omega(x)}{\Phi(\vartheta^*)},$$

so that: 

$$\bar E = \frac{1}{\Phi(\vartheta^*)} \int x\, \Omega(x)\, e^{-\vartheta^* x}\, dx = -\left( \frac{d \log \Phi(\alpha)}{d\alpha} \right)_{\alpha = \vartheta^*}.$$

This shows that the parameter $\vartheta^*$ plays the same role in the 
second picture as the parameter $\vartheta$ did in the first one, provided 
that instead of the fixed value $E = a$ we introduce in the 
canonical distribution the quantity $\bar E$ representing the 
mathematical expectation of $E$. 

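Assume now that the system $G$ consists of two components 
$G_1$ and $G_2$ with the energies $E_1$ and $E_2$. In order to obtain the 
distribution law of the system $G_1$ in its phase space $\Gamma_1$, we 
must integrate the expression (64) over the entire phase space 
$\Gamma_2$. Since $E = E_1 + E_2$ and $\Phi(\vartheta^*) = \Phi_1(\vartheta^*)\,\Phi_2(\vartheta^*)$, and also 
(according to (32) section 16): 

$$\int_{\Gamma_2} e^{-\vartheta^* E_2}\, dV_2 = \Phi_2(\vartheta^*),$$

this integration yields 

$$\int_{\Gamma_2} \frac{e^{-\vartheta^* E}}{\Phi(\vartheta^*)}\, dV_2 = \frac{e^{-\vartheta^* E_1}}{\Phi(\vartheta^*)} \int_{\Gamma_2} e^{-\vartheta^* E_2}\, dV_2 = \frac{e^{-\vartheta^* E_1}\, \Phi_2(\vartheta^*)}{\Phi(\vartheta^*)} = \frac{e^{-\vartheta^* E_1}}{\Phi_1(\vartheta^*)}, \tag{65}$$

which represents the probability density governing the 
distribution of the component $G_1$ in its phase space. Thus, we see that 
any component of the canonically distributed system is also 
distributed canonically with the same value of the parameter $\vartheta^*$. 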

We know that the formula (65) holds also for small 
components in the microcanonical distribution as represented by 
the fundamental law (63). On the other hand, for the large 
components the formula (65) does not hold since it leads to the 
energy distribution law 

$$\frac{\Omega_1(E_1)\, e^{-\vartheta^* E_1}}{\Phi_1(\vartheta^*)},$$

which differs quite essentially from the formula (52) section 
22 for the distribution of the large component of the 
microcanonical system. 

The relation: 

$$\Phi(\vartheta^*) = \Phi_1(\vartheta^*) \cdot \Phi_2(\vartheta^*)$$

shows us that in the case of the canonical distribution for a 
system consisting of several components the distribution laws 
for these components must be combined as if they were 
mutually independent groups of random quantities. On the other 
hand, in the case of a microcanonical distribution the various 
components of a given system are considered as mutually 
dependent due to the constancy of the total energy. That is 
why, in spite of the fact that for small components the 
distribution laws based on (63) and (64) are almost identical, such 
is not the case for the large components. 

It is clear that the computations based on the law (64) are 
considerably simpler than those based on the law (63), since 
it is easier to operate with independent random quantities. 
Therefore, we must ask ourselves: to what extent can we use, 
as an approximation, the fundamental law (64) instead of (63) 
for the calculation of the mean values of the phase functions 

in the microcanonical distributions? 

We have already mentioned that the majority of phase 
functions which one encounters in statistical mechanics are 
sum functions; i.e., they can be written in the form of a sum 
of terms each depending on the coordinates of one molecule 
only. Because of the above mentioned similarity in the distri¬ 
bution laws of small components, the mean value of each term 
can be found approximately using the formulae belonging to 
the canonical distribution (this was a basic idea in our method 
of approximation). But the mean value of the sum is always 
equal to the sum of the mean values for dependent as well as 
for independent quantities; thus in calculation of the mean 
values of the sum functions we can, as an approximation, use 
the canonical (64) instead of microcanonical distribution (63). 
As mentioned above this change constitutes our method of 
approximation. 

However, if we are looking for the mean value of a function 
which is not a sum function (but, for example, the square of 
a sum function), the substitution of (63) by (64) would lead, 
generally speaking, to a completely erroneous result; the mean 
value of such a function in the microcanonical distribution is 
entirely different from its mean value in the canonical 
distribution. The trivial example of that kind is given by the 
dispersion of the total energy of the given system; this quantity, 
which is evidently equal to zero in the microcanonical 
distribution, has a definite positive value in the canonical 
distribution. In Chapter VIII we will see further examples 
of this kind. 
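
The contrast between the two laws for the total energy can be made concrete for the monatomic gas of section 23. There the structure function is proportional to $x^{(3n/2)-1}$, so that under the canonical law the energy follows a Gamma law with shape $3n/2$ and rate $\vartheta^*$, whose mean and variance are the shape divided by the rate and by the square of the rate. The sketch below, with hypothetical names and values, tabulates mean and dispersion under both idealizations, choosing $\vartheta^*$ so that the canonical mean equals the fixed microcanonical energy $a$.

```python
def total_energy_moments(n, a):
    """Mean and dispersion of the total energy of n monatomic molecules under both laws."""
    theta = 3.0 * n / (2.0 * a)   # canonical parameter for which the mean energy equals a
    micro = {"mean": a, "dispersion": 0.0}               # energy fixed on the surface E = a
    canon = {"mean": 1.5 * n / theta,                    # Gamma law: shape / rate
             "dispersion": 1.5 * n / theta**2}           # Gamma law: shape / rate**2
    return micro, canon

if __name__ == "__main__":
    micro, canon = total_energy_moments(100, 150.0)
    print(micro, canon)
```

The means coincide, as they must for a sum function, while the mean of the square of the energy differs by the canonical dispersion, which is of relative order $1/n$ compared with $a^2$.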


The relation between the distribution laws (63) and (64) is 
unfortunately not clarified sufficiently in the existing texts on 
statistical mechanics. Thus, one often says that one can base 
the statistical mechanics on either the ergodic theory leading 
to the formula (63) or the "hypothesis of canonical 
distribution", i.e. the distribution law (64) which is introduced as a 
postulate, and that in the final count both points of view lead 
to the same results. 


We see, however, what the actual situation is. The basic 
laws (63) and (64) correspond to two entirely different idealized 
pictures. Looking for the rational foundation, instead of 
referring only to the practical successes, we will see that the 
ergodic theory, or some kind of its equivalent, is necessary in 
both cases, since the law (64) for a system in thermal 
equilibrium can be based only on the law (63) for the isolated system. 
Finally the statement that both theories lead to the same 
result is correct only within certain limits (for mean values of 
the sum functions); beyond these limits the introduction of 
quantities taken from one of these idealized pictures into the 
other would lead to very serious mistakes. 




CHAPTER VI 




IDEAL MONATOMIC GAS 


26. Velocity distribution. Maxwell's law. In the present 
chapter we will use the fundamental formulae of the theory 
of the ideal monatomic gas as derived in sections 23, 24 of 
the previous chapter and will employ the same notation for the 
quantities involved. Let us choose one of the molecules of such 
a gas with mass $m_i$, and consider one of its velocity 
components, for example, $\dot x_i = (1/m_i)\, p_{xi}$, where $p_{xi}$ is the component 
of the momentum of the molecule in the $x$-direction. Our 
problem is to find the distribution of the quantity $\dot x_i$. For 
this purpose we introduce the function $f(x_i, y_i, z_i, p_{xi}, 
p_{yi}, p_{zi})$ which is defined in six-dimensional space $\gamma_i$ by the 
relations 

$$f = \begin{cases} 1 & \text{if } \dot x_i < z, \\ 0 & \text{otherwise,} \end{cases}$$

where $z$ is an arbitrary real number. It is clear that the mean 
value of the function $f$ represents the probability of the 
inequality $\dot x_i < z$ or, in other words, gives us the distribution 
law of the quantity $\dot x_i$. 

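But, according to the formula (48) section 21, chapter V: 

$$\bar f = \int_{\gamma_i} f\, \frac{e^{-\vartheta e_i}}{\varphi_i(\vartheta)}\, dv_i + O\!\left(\frac{1}{n}\right). \tag{66}$$

On the other hand, we have (section 23, chapter V): 

$$e_i = \frac{m_i}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2) + U_i(x_i, y_i, z_i). \tag{67}$$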

As we know, the presence of the term $U_i$ in the expression for 
$e_i$ permits us to calculate the integral on the right hand side 
of the formula (66) by integrating only within the limits of 
the vessel containing the gas; within the vessel, however, $U_i = 0$. 
Since, finally, 

$$dp_{xi} = m_i\, d\dot x_i, \qquad dp_{yi} = m_i\, d\dot y_i, \qquad dp_{zi} = m_i\, d\dot z_i,$$

we have: 

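$$\bar f \approx \frac{V m_i^3}{\varphi_i(\vartheta)} \iiint_{\dot x_i < z} \exp\left[ -\frac{m_i \vartheta}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2) \right] d\dot x_i\, d\dot y_i\, d\dot z_i$$

$$= \frac{V m_i^3}{V (2\pi m_i)^{3/2}\, \vartheta^{-3/2}} \int_{\dot x_i < z} \exp\left[ -\frac{m_i \vartheta}{2} \dot x_i^2 \right] d\dot x_i \int \exp\left[ -\frac{m_i \vartheta}{2} \dot y_i^2 \right] d\dot y_i \int \exp\left[ -\frac{m_i \vartheta}{2} \dot z_i^2 \right] d\dot z_i,$$

or, since each of the two complete integrals is equal to 
$[2\pi/(m_i \vartheta)]^{1/2}$, 

$$\bar f \approx \left( \frac{m_i \vartheta}{2\pi} \right)^{1/2} \int_{-\infty}^{z} \exp\left[ -\frac{m_i \vartheta}{2} \dot x_i^2 \right] d\dot x_i.$$

Thus we see that the quantity $\dot x_i$ is approximately 
distributed according to the Gauss law with the center at zero 
and with the dispersion $1/(m_i \vartheta)$. In particular, it means that 
the distribution law of velocity components (known as 
Maxwell's law) depends on the mass of the molecule. In the case 
of the gas mixture the velocity distribution, in contrast to 
the energy distribution, is different for different molecules; the 
heavy molecules have smaller dispersion than the light ones 
which physically means that the former move slower than the 
latter (this is quite natural since their mean kinetic energies must 
be the same). 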
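
The dependence of Maxwell's law on the mass, together with the mass independence of the mean kinetic energy, can be verified by direct numerical integration of the Gaussian density $(m_i \vartheta / 2\pi)^{1/2} \exp(-m_i \vartheta \dot x_i^2 / 2)$. The sketch below is only an illustration; the function name, the integration window of twelve standard deviations and the sample masses are assumptions.

```python
import math

def velocity_moments(m, theta, steps=20000):
    """Return (total probability, second moment) of one velocity component by the midpoint rule."""
    sigma = 1.0 / math.sqrt(m * theta)        # standard deviation implied by the Gauss law
    half_width = 12.0 * sigma                 # the tails beyond this window are negligible
    h = 2.0 * half_width / steps
    norm = math.sqrt(m * theta / (2.0 * math.pi))
    total = 0.0
    second = 0.0
    for j in range(steps):
        v = -half_width + (j + 0.5) * h
        weight = norm * math.exp(-0.5 * m * theta * v * v) * h
        total += weight
        second += v * v * weight
    return total, second

if __name__ == "__main__":
    theta = 2.0
    for m in (1.0, 16.0):
        total, second = velocity_moments(m, theta)
        print(m, total, second, 0.5 * m * second)  # last column: mean kinetic energy per component
```

The second moment scales as $1/(m_i \vartheta)$, while the mean kinetic energy per velocity component stays at $1/(2\vartheta)$ for every mass, in agreement with the equipartition theorem of section 24.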


27. The gas pressure. We know from physics the important 
role played by pressure in the theory of gases. It is impossible 
to build the statistical theory of monatomic ideal gas without 
first defining this physical notion in terms of our mechanical 
picture. 

Imagine within the vessel containing a given mass of gas a 
fixed thin plate with the area $S$. During the given time 
interval $(t, t + \Delta t)$ one side of this plate is subject, generally 
speaking, to a number of impacts from gas molecules. Each 
of these impacts communicates to the plate a certain impulse. 
The sum of the components of these impulses perpendicular to 
the plate determines the pressure of the gas on our plate, or 
to be more exact, on one of its sides. The mean value of this 
impulse referred to unit time and unit area is called the 
pressure (at a given point and in a given direction). 

To make this definition more exact let us choose on our 
plate an arbitrary point $P$ and its arbitrary neighborhood $\Delta s$. 
The sum of impulse-components perpendicular to the plate 
which fall within the time interval $(t, t + \Delta t)$ into the area 
$\Delta s$ depends on the state of gas at the moment $t$ and represents 
consequently a phase function of our system. The mean value 
of this phase function is a quantity depending on $\Delta t$ and $\Delta s$ 
only, so that dividing it by $\Delta t\, \Delta s$ and letting the time interval 
$\Delta t$ and the area $\Delta s$ approach zero we obtain in the limit the 
quantity which we will call the gas pressure at the point $P$ 
and in the given direction (as we shall see later, and as it is 
natural to expect on the basis of the symmetry considerations, 
the gas pressure at the point $P$ does not depend on the direction 
chosen). 

Since we can choose the coordinate system arbitrarily, we 
can assume, without loss of generality, that the area $\Delta s$ is 
perpendicular to the $x$-axis. Suppose a certain molecule has 
the velocity components $\dot x, \dot y, \dot z$ at the moment $t$. Where should 
this molecule be at this moment in order that it might strike 
the selected side of the plate $\Delta s$ within the time interval 
$(t, t + \Delta t)$? It is apparent that it must be within a sloping 
cylinder with base $\Delta s$ and height $|\dot x|\, \Delta t$. Furthermore, the axis 
of the cylinder must be parallel to the vector $(\dot x, \dot y, \dot z)$, and 
it must be constructed on that side of the plate $\Delta s$ which is 
subject to molecular impact. Let us assume, to be definite, 
that this side of the plate is facing the negative direction of 
the $x$-axis; in this case evidently we must have $\dot x > 0$ since 
otherwise impact is impossible. (We are neglecting here the 
possibility of collisions between the molecules under 
consideration during the time interval $\Delta t$.) 

Let $\Omega(x)$ be the structure function of the entire mass of gas, 
and $\Omega^{(i)}(x)$ be the structure function of the system which is 
obtained by removing the above selected molecule from the 
gas. We know (comp. 25) that for a total gas energy $E$ the 
probability density for the selected molecule in its phase space 
is given by 

$$q = \frac{\Omega^{(i)}(E - e_i)}{\Omega(E)},$$

where the energy of the selected molecule $e_i$ is a function of 
its six dynamic coordinates as determined by the formula (67). 
This density $q$ is also a function of the same kind. If the point 
$(x_i, y_i, z_i)$ is outside of the vessel containing the gas, 
$U_i(x_i, y_i, z_i)$ becomes infinite and $q = 0$ since $\Omega^{(i)}(E - e_i) = 0$. 
Inside of the vessel $U_i(x_i, y_i, z_i) = 0$, and $e_i$ together with 
$\Omega^{(i)}(E - e_i)$ are functions of the velocity coordinates $\dot x_i, \dot y_i, 
\dot z_i$ only. The same is true for the density $q$. This function is 
determined more exactly by the formula (53) of chapter V, 
according to which the quantity $\Omega(E)$ is given by the product 
of $V^n$ and a constant (here $x = E$) whereas the quantity 
$\Omega^{(i)}(E - e_i)$ (for the points within the vessel) is equal to the 
product of $V^{n-1}$ and a certain quantity depending on $\dot x_i, \dot y_i, 
\dot z_i$ only. Therefore: 

$$q = \frac{1}{V}\, \chi(\dot x_i, \dot y_i, \dot z_i),$$

where the form of the function $\chi$ can be determined (this, 
however, is of no importance for us at the moment). 

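The probability that the selected molecule will strike the
area $\Delta s$ within the given time interval $\Delta t$ and with
the velocity components in the interval $\dot x$ to
$\dot x + d\dot x$, $\dot y$ to $\dot y + d\dot y$, and $\dot z$ to
$\dot z + d\dot z$, can be obtained by integrating the quantity

    $$ \frac{1}{V}\, \chi(\dot x, \dot y, \dot z)\, dx\, dy\, dz\, d\dot x\, d\dot y\, d\dot z $$

over the entire volume of the above described cylinder, i.e.
will be given by

    $$ \frac{1}{V}\, |\dot x|\, \Delta t\, \Delta s\, \chi(\dot x, \dot y, \dot z)\, d\dot x\, d\dot y\, d\dot z. $$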


In the process of the impact our molecule communicates to
the area $\Delta s$ the impulse $m_i |\dot x|$ in the direction of
the x axis ($m_i$ being the mass of the molecule). Remembering that
this impulse is zero if at the moment $t$ our molecule is located
outside of the above described cylinder, we obtain for the
mathematical expectation of the mean impulse communicated
by the selected molecule to the area $\Delta s$ in the direction
Ox during the time interval $\Delta t$ the expression

(69)    $$ \frac{m_i\, \Delta t\, \Delta s}{V} \iiint \dot x^2\, \chi(\dot x, \dot y, \dot z)\, d\dot x\, d\dot y\, d\dot z $$

where the integration is performed only for the positive values
of $\dot x$ (and all values of $\dot y$ and $\dot z$).

We have been speaking so far about the first phase of impact
during which the velocity of the incident molecule drops from
$\dot x$ to zero and it communicates its original impulse in the
direction Ox to the plate. This will be followed by a second
phase during which the plate communicates to the molecule
the impulse in the opposite direction. We can calculate the
mathematical expectation of this new impulse in the same way,
with the only difference that the integral in the expression (69)
must be taken over the negative values of $\dot x$.

In order to obtain the mathematical expectation of the total
impulse communicated to the area $\Delta s$ during the time
$\Delta t$ in the direction Ox we must add both results, thus
obtaining an expression of the type (69) in which the integration
is extended for all coordinates within the limits
$(-\infty, +\infty)$.





The integral (69) can be evaluated approximately without
detailed calculation, since the integral

(70)    $$ \frac{m_i}{2} \iiint (\dot x^2 + \dot y^2 + \dot z^2)\, \chi(\dot x, \dot y, \dot z)\, d\dot x\, d\dot y\, d\dot z $$

represents the mean value of kinetic energy of the selected
molecule,¹ which, according to formula (57) chapter V, is equal
to $3/(2\vartheta)$. Because of the symmetry in the three velocity
components, the integral in (69), taken over all values of
$\dot x$, is equal to one third of the integral in (70), thus
being equal to $1/(m_i \vartheta)$.

Consequently the mathematical expectation of the impulse
becomes approximately

    $$ \frac{\Delta t\, \Delta s}{V \vartheta} $$

or

    $$ \frac{1}{V \vartheta} $$

per unit time per unit area. (In particular we notice that it
does not depend on the mass of the selected molecule.)

If the number of molecules is $n$, the mean value of the total
impulse communicated to the plate (in the normal direction)
per unit area per unit time is:

(71)    $$ p = \frac{n}{V \vartheta} $$

which is usually known as the pressure of the gas. We see that
the gas pressure does not depend on the location or the
orientation of the plate, being only a function of the volume, the
number of molecules and the total energy of the gas (since
$\vartheta$ is a function of the total energy determined by the
formula (55) chapter V).

¹In fact, according to (68), this mean value is represented by

    $$ \int \cdots \int \frac{m_i}{2} (\dot x^2 + \dot y^2 + \dot z^2)\, q\, dx\, dy\, dz\, d\dot x\, d\dot y\, d\dot z
       = \frac{m_i}{2} \iiint (\dot x^2 + \dot y^2 + \dot z^2)\, \chi\, d\dot x\, d\dot y\, d\dot z. $$

From (71) we obtain:

    $$ pV = \frac{n}{\vartheta} $$

which coincides with the well known formula of Clapeyron if
we put $\vartheta = 1/kT$ where $k$ is Boltzmann's constant and $T$
the absolute temperature. In the next section we will discuss
the validity of such a physical interpretation, and will now
remark only that the relation

    $$ \frac{n}{V} = p \vartheta $$

(obtainable from (71)), in which $\vartheta$ is considered as some
universal function of temperature, leads to the well known
Avogadro law: for equal temperature and pressure all gases
contain an equal number of molecules per unit volume.
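
The symmetry argument and formula (71) can be checked numerically. The sketch below is only an illustration: it assumes that $\chi$ factors into three identical normalized one-dimensional Maxwell densities (the text leaves the form of $\chi$ open), and the function names, the molecular mass and the number density are hypothetical choices made for the example.

```python
import math

def maxwell_second_moment(mass, theta, half_width=12.0, steps=20000):
    """Trapezoid estimate of the integral of u^2 * chi_1(u) over all u.

    chi_1 is the normalized one-dimensional Maxwell density
    sqrt(theta*mass/(2*pi)) * exp(-theta*mass*u^2/2); this form is an
    assumption of the sketch, not something fixed by the text.
    """
    sigma = 1.0 / math.sqrt(theta * mass)   # width of the density
    lower = -half_width * sigma             # tails beyond this are negligible
    h = 2.0 * half_width * sigma / steps
    norm = math.sqrt(theta * mass / (2.0 * math.pi))
    total = 0.0
    for j in range(steps + 1):
        u = lower + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * u * u * norm * math.exp(-0.5 * theta * mass * u * u)
    return total * h

def pressure(n, volume, theta):
    """Formula (71): p = n / (V * theta)."""
    return n / (volume * theta)
```

With $\vartheta = 1/kT$ the second moment should agree with $1/(m_i \vartheta)$, and `pressure(n, V, theta) * V` should agree with $nkT$, which is Clapeyron's formula.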


28. Physical interpretation of the parameter $\vartheta$. We have
seen that the parameter $\vartheta$ plays a very important role in
statistical systems; it enters in an essential manner into the
expressions for any physical characteristics of the system.

On the other hand the notion of the temperature which
plays, as is well known, the fundamental role in thermodynamics
has not found as yet any interpretation in our
mechanical theory; in our attempts to build a purely mechanical
theory of the heat processes we must try to find an interpretation
of this physical notion in terms of the notions used above.

This comparison justifies an attempt to connect the physical
meaning of the parameter $\vartheta$ with the temperature $T$ of the
system, although, of course, it does not justify postulating a
universal functional dependence between $\vartheta$ and $T$. We have
seen in the previous section that in the case of a monatomic
ideal gas one can give definite arguments in favor of such a
postulate, and that in this case one can arrive at the relation

(72)    $$ \vartheta = \frac{1}{kT} $$

where $k$ is Boltzmann's constant. However, even within the
theory of the ideal monatomic gas it would be reasonable to
explore the immediate consequences of the above postulate
before accepting it. Thus we could substitute the expression
(72) for $\vartheta$ in the distributions so far obtained (for
example, in section 26); we would see in this case that these
distributions coincide exactly with those derived in physics on
the basis of entirely different considerations.

Since the postulate (72) demands a careful check even in
the simple theory of the ideal monatomic gas, it is clear that
at the present moment we can tell hardly anything more
definite about a general dependence between $\vartheta$ and $T$ in
the case of more complicated physical systems. We will see,
however, in the next chapter that the interpretation of the
fundamental laws of thermodynamics on the basis of our general
mechanical theory will lead quite naturally to a generalization
of the postulate (72) for a very broad class of physical systems.

In the present chapter we will give only one additional
argument in favor of a universal dependence between $\vartheta$ and
the temperature of the system.

Suppose we have two physical systems with temperatures
$T_1$ and $T_2$ which are characterized by the values
$\vartheta_1$ and $\vartheta_2$ of our statistical parameter.

If we unite the two systems $(T_1, \vartheta_1)$ and
$(T_2, \vartheta_2)$ into one system $(T, \vartheta)$ (i.e. if we
assume that these two systems are interacting with one another),
the final temperature $T$ of the composite system will lie between
$T_1$ and $T_2$; in particular in the case $T_1$ is equal to $T_2$
we will have $T = T_1 = T_2$. If we assume that $\vartheta$ is a
monotonic function of temperature its value must be intermediate
between $\vartheta_1$ and $\vartheta_2$ for the composite
system. It is easy to see that $\vartheta$ actually possesses this
property.

In fact let $\Phi_1(\alpha)$, $\Phi_2(\alpha)$ and $\Phi(\alpha)$ be
the generating functions of the two components and the composite
system, and $E_1$, $E_2$ and $E$ the corresponding total energies
(so that $E = E_1 + E_2$ and
$\Phi(\alpha) = \Phi_1(\alpha)\Phi_2(\alpha)$).

We know that: 


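    $$ E_1 = -\left(\frac{d \log \Phi_1}{d\alpha}\right)_{\alpha = \vartheta_1}, \quad
       E_2 = -\left(\frac{d \log \Phi_2}{d\alpha}\right)_{\alpha = \vartheta_2}, \quad
       E = -\left(\frac{d \log \Phi}{d\alpha}\right)_{\alpha = \vartheta}, $$

so that

(73)    $$ \left(\frac{d \log \Phi}{d\alpha}\right)_{\alpha = \vartheta}
           = \left(\frac{d \log \Phi_1}{d\alpha}\right)_{\alpha = \vartheta_1}
           + \left(\frac{d \log \Phi_2}{d\alpha}\right)_{\alpha = \vartheta_2}. $$

On the other hand, for arbitrary $\alpha$,

    $$ \frac{d \log \Phi}{d\alpha} = \frac{d \log \Phi_1}{d\alpha} + \frac{d \log \Phi_2}{d\alpha} $$

and, in particular, for $\alpha = \vartheta$,

(74)    $$ \left(\frac{d \log \Phi}{d\alpha}\right)_{\alpha = \vartheta}
           = \left(\frac{d \log \Phi_1}{d\alpha}\right)_{\alpha = \vartheta}
           + \left(\frac{d \log \Phi_2}{d\alpha}\right)_{\alpha = \vartheta}. $$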

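Let us assume for the moment that $\vartheta$ lies outside of the
interval $(\vartheta_1, \vartheta_2)$, being, for example, larger
than $\vartheta_1$ and $\vartheta_2$. Since according to the
property 4 (chapter IV, section 16) the logarithmic derivatives of
the generating functions increase with increasing value of the
argument, we would get

    $$ \left(\frac{d \log \Phi_1}{d\alpha}\right)_{\alpha = \vartheta} > \left(\frac{d \log \Phi_1}{d\alpha}\right)_{\alpha = \vartheta_1}, \quad
       \left(\frac{d \log \Phi_2}{d\alpha}\right)_{\alpha = \vartheta} > \left(\frac{d \log \Phi_2}{d\alpha}\right)_{\alpha = \vartheta_2}; $$

thus for $\alpha = \vartheta$ the right hand side of the equation
(74) is larger than the right hand side of the equation (73)
whereas their left hand sides are equal to one another. This
contradiction proves the incorrectness of the above assumption.
The case in which $\vartheta$ is smaller than both $\vartheta_1$ and
$\vartheta_2$ is excluded in the same way. This argument holds
apparently for any physical system of the general type considered
above.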
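
For the ideal monatomic gas the intermediate-value property can be seen directly. The sketch below assumes $\log \Phi(\alpha) = \mathrm{const} - \tfrac{3n}{2} \log \alpha$, so that the equation $d \log \Phi / d\alpha + E = 0$ has the root $\vartheta = 3n/(2E)$; the function names and the sample values of $n$ and $E$ are hypothetical.

```python
def theta_ideal(n, energy):
    """Root of d(log Phi)/d(alpha) + E = 0 when log Phi = const - (3n/2) log(alpha)."""
    return 1.5 * n / energy

def composite_theta(n1, e1, n2, e2):
    """Parameter of the united system: Phi = Phi_1 * Phi_2 and E = E_1 + E_2."""
    return theta_ideal(n1 + n2, e1 + e2)
```

Here the composite value is the ratio $3(n_1 + n_2) / (2(E_1 + E_2))$, which always lies between $3n_1/(2E_1)$ and $3n_2/(2E_2)$ and coincides with both when they are equal.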


29. Gas pressure in an arbitrary field of force. We will
now return to the case of the ideal monatomic gas, but will
assume that its molecules are subject to the action of an
external field of force, and that the potential energy of each
molecule is a function only of its position (i.e. independent of
its velocity and of the positions and velocities of other
molecules).

In other words we will assume that the energy of
each individual molecule can be written in the form

(75)    $$ e_i = \frac{m_i}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2) + \varepsilon_i(x_i, y_i, z_i), $$

where $\varepsilon_i(x_i, y_i, z_i)$ is the potential energy in the
given field.

We will define the gas pressure $p$ at a given point and in a
given direction in the same way as was done in section 27.
The distribution law for a selected molecule of our gas can be
written (in the notation of section 27) in the form:

(76)    $$ \frac{\Omega^{(i)}(E - e_i)}{\Omega(E)} $$

where $e_i$ (the energy of the selected molecule) is a function
of six dynamic coordinates of that molecule as determined by
the formula (75), and $\Omega^{(i)}$ is the structure function of
the system consisting of the entire mass of the gas minus the
selected molecule. At a certain moment $t$ the selected molecule is
located within a certain cell of the phase space; in order that
this molecule would hit the area $\Delta s$ within the time
interval $(t, t + \Delta t)$ it is necessary and sufficient that
this cell is a cylinder with the base $\Delta s$ and the height
$\dot x_i \Delta t$ ($\dot x_i > 0$). The impulse communicated by
this molecule to the area $\Delta s$ is $m_i \dot x_i$. We are
neglecting here the effect of the collisions between the
molecules as well as the possible deviations from rectilinear
motion caused by the external field; in fact, these effects vanish
when $\Delta s$ and $\Delta t$ approach zero.

Thus, in order to calculate the value of the mean impulse
communicated by the selected molecule to the area $\Delta s$ in the
direction Ox during the time interval $\Delta t$, we must integrate
the product of $m_i \dot x_i$ and (76) over the six dimensional
phase space of the selected molecule; the integration must be
carried over the region in which $\dot x_i > 0$ and the limits of
integration are determined by the condition that the selected
molecule lies within the above described cylindrical volume. Since
all dimensions of this cylinder are small quantities of the order
of $\Delta s$ and $\Delta t$, we can consider the values of the
space coordinates entering into the integral as certain fixed
quantities (for example, making them equal to the coordinates of a
certain point P within the area $\Delta s$ and towards which this
area contracts in the limiting case). We can repeat here all the
arguments of section 27 pertaining to the second phase of the
impact, and come to the conclusion that in order to calculate the
value of the total impulse we must extend the integration along
the entire real axis Ox. Thus we can write for the mean value of
the impulse

    $$ \frac{m_i\, \Delta t\, \Delta s}{\Omega(E)} \iiint \dot x_i^2\, \Omega^{(i)}(E - e_i)\, d\dot x_i\, d\dot y_i\, d\dot z_i $$

or

    $$ \frac{m_i}{\Omega(E)} \iiint \dot x_i^2\, \Omega^{(i)}(E - e_i)\, d\dot x_i\, d\dot y_i\, d\dot z_i $$

per unit area per unit time. As already mentioned, the integration
must be extended through the entire three dimensional space
$(\dot x_i, \dot y_i, \dot z_i)$, whereas the space coordinates
$x_i, y_i, z_i$ may be set equal to the coordinates $x, y, z$ of the
point P at which the gas pressure is being calculated. In order to
get the pressure of the gas we have to take the sum of the above
expressions for all $n$ molecules of the gas. This gives:

    $$ p = \sum_{i=1}^{n} \frac{m_i}{\Omega(E)} \iiint \dot x_i^2\, \Omega^{(i)}(E - e_i)\, d\dot x_i\, d\dot y_i\, d\dot z_i. $$

It is clear that in this more general case the pressure of the
gas is different at different points since the above expression
for $p$ depends on the coordinates $x, y, z$ of the point P.

In considering an approximate expression we will consider
only the leading terms, leaving the reader the estimate of the
corresponding errors (which can be done easily on the basis of
the formulae in chapter V).

To approximate the expression (76) we will use the formula
(45) of chapter V. (Although this formula was established for
values of $e_i$ deviating from the mean value by an amount of
the order of magnitude $O(n^{1/2})$, it is easy to verify that the
part of the integral in the expression for $p$ which corresponds
to the large deviations is negligibly small as compared with
the main part. Thus in the approximate evaluation of this
integral we can either neglect this part or, what is more
convenient, use for the integrated function the same expression as
in the main part.) This gives us:

    $$ p = \sum_{i=1}^{n} \frac{m_i}{\varphi_i(\vartheta)} \iiint \dot x_i^2\, e^{-\vartheta e_i}\, d\dot x_i\, d\dot y_i\, d\dot z_i $$

where

    $$ \varphi_i(\vartheta) = \int \cdots \int e^{-\vartheta e_i}\, dx_i\, dy_i\, dz_i\, dp_{x_i}\, dp_{y_i}\, dp_{z_i} $$

is a generating function of the selected molecule.

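We now introduce for the energy $e_i$ its expression given by
(75), remembering that in this expression the coordinates
$x_i, y_i, z_i$ of the selected molecule must be replaced by the
coordinates $x, y, z$ of the point P. Since

    $$ \varphi_i(\vartheta) = \iiint \exp[-\vartheta \varepsilon_i(x_i, y_i, z_i)]\, dx_i\, dy_i\, dz_i
       \cdot \iiint \exp\left[-\frac{\vartheta m_i}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2)\right] dp_{x_i}\, dp_{y_i}\, dp_{z_i} $$

(where $p_{x_i} = m_i \dot x_i$, $p_{y_i} = m_i \dot y_i$,
$p_{z_i} = m_i \dot z_i$) we obtain, after obvious simplifications:

    $$ p = \sum_{i=1}^{n} \frac{\exp[-\vartheta \varepsilon_i(x, y, z)]}{\iiint \exp[-\vartheta \varepsilon_i(x_i, y_i, z_i)]\, dx_i\, dy_i\, dz_i}
       \cdot \frac{m_i \int_{-\infty}^{+\infty} u^2 \exp\left(-\frac{\vartheta m_i}{2} u^2\right) du}{\int_{-\infty}^{+\infty} \exp\left(-\frac{\vartheta m_i}{2} u^2\right) du} $$

or since

    $$ \int_{-\infty}^{+\infty} u^2 \exp\left(-\frac{\vartheta m_i}{2} u^2\right) du = \frac{1}{\vartheta m_i} \sqrt{\frac{2\pi}{\vartheta m_i}}, \quad
       \int_{-\infty}^{+\infty} \exp\left(-\frac{\vartheta m_i}{2} u^2\right) du = \sqrt{\frac{2\pi}{\vartheta m_i}}, $$

we get finally

(77)    $$ p = \frac{1}{\vartheta} \sum_{i=1}^{n} \frac{\exp[-\vartheta \varepsilon_i(x, y, z)]}{\iiint \exp[-\vartheta \varepsilon_i(x_i, y_i, z_i)]\, dx_i\, dy_i\, dz_i}. $$

In the particular case considered in section 27 $\varepsilon_i$ is
equal to zero inside of the vessel and to $+\infty$ outside of it,
which gives us:

    $$ p = \frac{n}{\vartheta V} $$

where $V$ is the volume of the vessel. This is exactly the
expression which we obtained before.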

We will consider as an example the behavior of a gas
enclosed in a vessel and subjected to the action of gravitational
force along the negative z direction. In this case the potential
energy is

    $$ \varepsilon_i(x, y, z) = m_i g z + U(x, y, z) $$

where $U(x, y, z) = 0$ inside the vessel and $= +\infty$ outside
of it.

According to the formula (45) chapter V the distribution law
of a single molecule is given by the approximate expression of
the density:

(78)    $$ \frac{1}{\varphi_i(\vartheta)} \exp\left[-\frac{\vartheta m_i}{2} (\dot x_i^2 + \dot y_i^2 + \dot z_i^2) - \vartheta m_i g z_i\right] $$

inside the vessel (outside the vessel the density is, of course,
equal to zero).

Assume for simplicity that all molecules have the same mass
$m_i = m$. Let us select an arbitrary point $P_0(x_0, y_0, z_0)$
and imagine it to be surrounded by an elementary volume $dv_0$.
The probability that some molecule will fall within this volume
can be obtained by multiplying $dv_0$ by the expression (78)
integrated over all three velocity coordinates from $-\infty$ to
$+\infty$ (here, of course, $z_i = z_0$). This probability will,
obviously, have the form

    $$ A e^{-\vartheta m g z_0}\, dv_0 $$

where $A$ is a function of $m$ and $\vartheta$. The mean number of
molecules which fall within the volume $dv_0$ is therefore:

    $$ n A e^{-\vartheta m g z_0}\, dv_0. $$

At another point $P(x, y, z)$ within the volume $dv$ this number
is given by:

    $$ n A e^{-\vartheta m g z}\, dv. $$

If $dv = dv_0$ the ratio of these numbers (i.e., the density
relative to $P_0$) is equal to

    $$ \exp[-m g \vartheta (z - z_0)]. $$

If we put, as above, $\vartheta = 1/kT$, we obtain for the relative
density of the gas the well known "barometric" formula

    $$ \exp\left[-\frac{mg}{kT} (z - z_0)\right] $$

which is usually derived in physics on the basis of entirely
different considerations. It is clear that formula (77) (for
$m_i = m$) leads us to an identical expression for the relative
gas pressure.
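
A short numerical illustration of the barometric formula follows. It is a sketch under stated assumptions: the value of $k$ is the SI value of Boltzmann's constant, the default $g$ is standard gravity, and the function name and the molecular mass used in the usage note are hypothetical inputs chosen only for the example.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K, SI value of Boltzmann's constant

def relative_density(mass, temperature, dz, g=9.80665):
    """Barometric factor exp[-m g (z - z0) / (k T)] with theta = 1/(k T).

    mass in kg, temperature in K, dz = z - z0 in m, g in m/s^2.
    """
    theta = 1.0 / (K_BOLTZMANN * temperature)
    return math.exp(-mass * g * theta * dz)
```

For a molecule of mass about 4.65e-26 kg at 288 K, `relative_density(4.65e-26, 288.0, 1000.0)` gives the density one kilometre above $P_0$ relative to the density at $P_0$; the factor is multiplicative in height, so three kilometres gives the cube of that value.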



CHAPTER VII 


THE FOUNDATION OF THERMODYNAMICS 

30. External parameters and the mean values of external
forces. In the previous chapters we have often considered the
case where the energy of the molecules forming the given
physical system depends not only on the dynamic coordinates
of these molecules, but also on a number of parameters
characterizing the position or the state of external bodies acting
on the system in question. Thus, for example, in the previous
section the quantity $g$, characterizing the gravitational field,
entered in a natural way into all the pertinent
formulae. In other cases such parameters can be represented,
for example, by the coordinates of some attraction or repulsion
centers. In the future such parameters will be referred to as
external parameters. Mathematically such an external parameter
is characterized by the fact that it has the same form in
the energy expressions for all molecules.

However, in all the above considerations, we have assumed
that the values of the external parameters always remain
constant; we will now concentrate on the cases when the
external parameters change with time. We remark that the
energies $e_i$ of the individual molecules as well as the total
energy $E = \sum e_i$ of the system are functions of the external
parameters which we will in general denote by
$\lambda_1, \ldots, \lambda_r$. A change of the external parameters
(such as the change of the field of forces, the change of the
position of the attraction or repulsion centers etc.) will result,
generally speaking, in a change in the energy of the system; the
point representing the system in its phase space will in this case
execute a transition from one surface of constant energy to
another. This change of energy is due to the work done by such
external forces as change these parameters.

The quantity $\partial e_i / \partial \lambda_s$ will be called the
generalized external force acting on the i-th molecule "in the
direction" of the parameter $\lambda_s$. Similarly the quantity

    $$ X_s = \sum_{i=1}^{n} \frac{\partial e_i}{\partial \lambda_s} = \frac{\partial E}{\partial \lambda_s} $$

will be called the generalized external force in the direction of
the parameter $\lambda_s$ acting on the entire system. Apart from
the parameters $\lambda_1, \ldots, \lambda_r$, the quantity $X_s$
depends on all the dynamical coordinates of the given system,
i.e., on the exact position of the representative point of the
phase space. In particular, the quantity $X_s$ may have entirely
different values for two different points on the same surface of
constant energy. Thus from the point of view of our theory the
quantity $X_s$ is a phase function, or, in the terminology of the
theory of probabilities, a random quantity. Thus it is natural to
ask ourselves about the mean value $\bar X_s$ of this quantity on a
given energy surface. According to the formula (48) chapter V we
have:

    $$ \overline{\frac{\partial e_i}{\partial \lambda_s}} = \int_{V_i} \frac{\partial e_i}{\partial \lambda_s}\, \frac{e^{-\vartheta e_i}}{\varphi_i(\vartheta)}\, dV_i $$

where $V_i$, $\varphi_i$, and $dV_i$ stand for the phase space, the
generating function, and the element of volume in the phase space
of the selected molecule, whereas the quantity $\vartheta$ is
related to the total energy of the system $E$ in the same way as in
the previous cases.

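But since

    $$ \varphi_i(\vartheta) = \int_{V_i} e^{-\vartheta e_i}\, dV_i $$

we have

    $$ \frac{1}{\varphi_i} \int_{V_i} \frac{\partial e_i}{\partial \lambda_s}\, e^{-\vartheta e_i}\, dV_i = -\frac{1}{\vartheta}\, \frac{\partial \log \varphi_i}{\partial \lambda_s} $$

and consequently

    $$ \overline{\frac{\partial e_i}{\partial \lambda_s}} = -\frac{1}{\vartheta}\, \frac{\partial \log \varphi_i}{\partial \lambda_s} $$

from which follows, approximately,

(79)    $$ \bar X_s = -\frac{1}{\vartheta} \sum_{i=1}^{n} \frac{\partial \log \varphi_i}{\partial \lambda_s} = -\frac{1}{\vartheta}\, \frac{\partial \log \Phi}{\partial \lambda_s} $$

where $\Phi = \Phi(\vartheta)$ is the generating function of the
system, depending, apart from $\vartheta$, on all external
parameters. The above formula represents an extremely simple
expression for the desired mean value.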

The element of work done by the external forces when the
external parameters change by $d\lambda_1, \ldots, d\lambda_r$ will
be defined, as usual, by the expression:

    $$ \delta A = \sum_{s=1}^{r} X_s\, d\lambda_s. $$

Similar to the quantities $X_s$, the quantity $\delta A$ is a
certain phase function (a random quantity). Calculating the mean
value of $\delta A$ for a given value $E$ of the total energy of
the system, we obtain:

    $$ \overline{\delta A} = \sum_{s=1}^{r} \bar X_s\, d\lambda_s = -\frac{1}{\vartheta} \sum_{s=1}^{r} \frac{\partial \log \Phi}{\partial \lambda_s}\, d\lambda_s. $$

It is important to remember that the sum in the right hand
side of the above expression does not represent the total
differential of the function $\log \Phi$ since, apart from the
external parameters, this function also depends on the parameter
$\vartheta$.


31. The volume of the gas as an external parameter. One
of the most important parameters encountered in the study of
gases is the volume of the vessel containing the gas. This
volume can usually be considered as a function of only one or
a few external parameters. Let us consider the simple case of a
gas enclosed in a cylindrical vessel with a movable piston; here
the shape of the vessel is completely determined by the volume
$V$. Thus, if the only forces acting on the gas are due to the
reaction of the walls, the function $U(x_i, y_i, z_i)$,
representing the potential energy of the molecule at the point
$(x_i, y_i, z_i)$, is completely determined by the quantity $V$;
this justifies consideration of the quantity $V$ as the external
parameter of our system.

Let us find the expression for the mean value of the generalized
force acting along this parameter. If we consider the case
of an ideal gas, and use the formula (54), we obtain

    $$ \Phi(\vartheta) = \prod_{i=1}^{n} \left(\frac{2\pi m_i}{\vartheta}\right)^{3/2} V^n. $$

According to the formula (79) of the previous section, the mean
value of the force becomes

    $$ -\frac{1}{\vartheta}\, \frac{\partial \log \Phi}{\partial V} = -\frac{1}{\vartheta}\, \frac{n}{V}. $$

The above expression is equal in magnitude and opposite in
sign to the expression for the pressure of the gas derived in
the previous chapter. Thus we can consider the gas pressure
as the mean value of the force with which this gas acts on the
external bodies "in the direction" of the parameter $V$. In
particular, the mean value of the elementary work done by
the gas when its volume changes by $dV$ can be written as:

    $$ -\overline{\delta A} = p\, dV. $$
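
The relation between formula (79) and the pressure can be checked by differentiating $\log \Phi$ numerically with respect to $V$. The sketch assumes the ideal-gas form $\log \Phi = n \log V + \tfrac{3n}{2} \log(2\pi m / \vartheta)$ with equal masses; the function names, the unit mass and the step size are hypothetical choices for the illustration.

```python
import math

def log_phi(n, volume, theta, mass=1.0):
    """log Phi for an ideal monatomic gas of n equal-mass molecules (assumed form)."""
    return n * math.log(volume) + 1.5 * n * math.log(2.0 * math.pi * mass / theta)

def mean_force_volume(n, volume, theta, h=1e-6):
    """Formula (79) with lambda = V, using a central finite difference in V."""
    d_log_phi = (log_phi(n, volume + h, theta) - log_phi(n, volume - h, theta)) / (2.0 * h)
    return -d_log_phi / theta
```

The returned value should agree, up to the finite-difference error, with $-n/(\vartheta V)$, that is, with the pressure (71) taken with the opposite sign.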


32. The second law of thermodynamics. The science of 
thermodynamics is based essentially on its two fundamental 
laws; thus, every theory pretending to represent the foundation 
of thermodynamics must prove that these two fundamental 
laws can be derived from its basic principles. Once this is done, 
the entire system of thermodynamic theory can be developed
logically as a consequence of the two laws. 

The first fundamental law of thermodynamics is the law of 
conservation of energy; it is clear that any mechanical founda¬ 
tion of the theory of heat includes this law quite automatically, 
since the law of conservation of energy represents the first 
integral of the equations of motion. 

We encounter an entirely different situation in the case of
the second fundamental law which, in the frame of the
mechanical theory, presents a theorem subject to mathematical
proof.

In the customary (non statistical) treatment of thermodynamics
the state of a physical system is usually characterized
by a set of external parameters and by the temperature of
the system. We have seen before that there are many reasons
to assume that the parameter $\vartheta$ is directly connected with
the notion of the temperature of the system, and that, on the other
hand, a knowledge of $\vartheta$ is equivalent to a knowledge of the
total energy $E$ of the system. Thus, in the frame of our theory,
the state of the system is completely determined if, in addition
to the value of the external parameters, we also know the
surface of constant energy which contains the representative
point of our system. Thus, in the classical treatment, we do
not distinguish between the states of the system represented
by different points on the same energy surface. Because of this
fact we will agree to use the term "a thermodynamic function"
for any quantity which is completely determined by the parameters
$\vartheta, \lambda_1, \ldots, \lambda_r$. We have already
encountered such "thermodynamic functions" in many places in the
previous discussion. As an example, we can mention the generating
function $\Phi$, the total energy of the system
$E = -(\partial \log \Phi)/\partial \vartheta$ (we use
here the partial derivative to underline the fact that the values
of the parameters $\lambda_1, \ldots, \lambda_r$ must be kept
constant) and finally the mean values of the acting forces

    $$ \bar X_s = -\frac{1}{\vartheta}\, \frac{\partial \log \Phi}{\partial \lambda_s} \quad (1 \le s \le r). $$

It is clear that any thermodynamic function is at the same
time a phase function subject to the condition of having a
constant value on the surface of constant energy. Conversely,
any phase function which is constant on a surface of given
energy can be considered as a thermodynamic function.

Consider the transition of a system from one of its states
$Z_1(\vartheta, \lambda_1, \ldots, \lambda_r)$ (determined in the
above described classical thermodynamical sense) into another
"immediately adjoining" state
$Z_2(\vartheta + d\vartheta, \lambda_1 + d\lambda_1, \ldots, \lambda_r + d\lambda_r)$.
The work done by the system in this transition is given by:

    $$ -\delta A = -\sum_{s=1}^{r} X_s\, d\lambda_s = -\sum_{s=1}^{r} \frac{\partial E}{\partial \lambda_s}\, d\lambda_s. $$

It must be remembered here that the generalized forces $X_s$
are not thermodynamic, but rather phase, functions depending
on the entire set of the phase coordinates (including, of course,
the external parameters). These forces (and consequently also
the work $-\delta A$) are not determined by the knowledge of the
states $Z_1$ and $Z_2$, since a given thermodynamical state of a
system corresponds to an entire continuum of individual states
in the sense of statistical mechanics (the entire energy surface
in the phase space of the system). From the point of view of
statistical mechanics the work $-\delta A$ is not determined
uniquely by the original and final thermodynamic characteristics
of the system; in fact, it can have entirely different values
depending on the exact positions of the representative point of
the system in its phase space. This indicates that the quantity
$-\delta A$ cannot be identified with the elementary work in the
sense of physical thermodynamics.

As we have often seen above, such a situation is typical in
our theory; in any attempt to build a bridge between statistical
mechanics and any physical theory the role of physical quantities
is played not by phase functions themselves but rather
by their mean values taken over a given thermodynamical state
of a system.

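In our case the equivalent of the elementary work as
considered in classical thermodynamics is not the phase function
$-\delta A$ but its mean value on a given surface of constant
energy:

    $$ -\overline{\delta A} = -\sum_{s=1}^{r} \bar X_s\, d\lambda_s = \frac{1}{\vartheta} \sum_{s=1}^{r} \frac{\partial \log \Phi}{\partial \lambda_s}\, d\lambda_s. $$

Writing $dE$ for the total energy change of the system in its
transition from the state $Z_1$ into the state $Z_2$ we have

    $$ dE = \frac{\partial E}{\partial \vartheta}\, d\vartheta + \sum_{s=1}^{r} \frac{\partial E}{\partial \lambda_s}\, d\lambda_s. $$

The above expression is determined uniquely by the states $Z_1$
and $Z_2$ since $E$ is a thermodynamical function with a well
defined value for each thermodynamically described state. But

    $$ E = -\frac{\partial \log \Phi}{\partial \vartheta} $$

and consequently

    $$ -\overline{\delta A} = \frac{1}{\vartheta} \sum_{s=1}^{r} \frac{\partial \log \Phi}{\partial \lambda_s}\, d\lambda_s = \frac{1}{\vartheta}\, [d \log \Phi + E\, d\vartheta]. $$

Therefore,

    $$ \vartheta (dE - \overline{\delta A})
       = \vartheta\, \frac{\partial E}{\partial \vartheta}\, d\vartheta + \vartheta \sum_{s=1}^{r} \frac{\partial E}{\partial \lambda_s}\, d\lambda_s + E\, d\vartheta + d \log \Phi
       = \vartheta\, dE + E\, d\vartheta + d \log \Phi = d(E\vartheta + \log \Phi). $$

Thus we see that the quantity

    $$ \vartheta (dE - \overline{\delta A}) $$

is the total differential of a certain thermodynamic function.
The above result really contains the second law of
thermodynamics. In fact, in the classical presentation of
thermodynamics the quantity $dE$ is the sum of the work $\delta A$
done by the external forces, and the "amount of heat" $\delta Q$
received by the system during the elementary transition. Since the
quantity $\delta Q$ is formally defined as the difference
$dE - \delta A$, it is clear that it need not necessarily be the
total differential of some thermodynamic function. However, the
second law of thermodynamics tells us that the quantity
$\delta Q / T$, where $T$ is the absolute temperature of the
system, is always a total differential.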

The most satisfactory statement of this law is as follows:
there exists such a function $\vartheta$ of the temperature, and
such a function $W$ of the temperature and of the external
parameters that for any elementary change of the thermodynamic
state of a system we have

    $$ \vartheta\, \delta Q = dW. $$

In other words the function $\vartheta$ represents the integrating
factor of the quantity $\delta Q$.

The existence of such an integrating factor, depending only
on temperature, represents one of the formulations of the
second law of thermodynamics. This is, however, exactly
what we have proved if one considers the parameter $\vartheta$ to
be directly connected with the temperature of the system.

In classical thermodynamics the absolute temperature $T$ is
defined by the relation

(81)    $$ \vartheta = \frac{1}{kT} $$    ($k$ = Boltzmann's constant)

in terms of this integrating factor $\vartheta$, the existence of
which is postulated in the second law. Introducing

    $$ S = kW = k[E\vartheta + \log \Phi(\vartheta)] = k\left[\log \Phi(\vartheta) - \vartheta\, \frac{\partial \log \Phi}{\partial \vartheta}\right] $$

we can write the second law of thermodynamics in the form

    $$ \frac{\delta Q}{T} = dS. $$

The thermodynamic function $S$ is known as the entropy of the
system. The above given argument is the complete foundation
of the second law of thermodynamics in the frame of our theory,
and indicates the reasonableness of the relation (81) as a
universal postulate pertaining to any system of the assumed
type.

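Example: for an ideal monatomic gas we have, according to
chapter V,

    $$ E = -\frac{\partial \log \Phi}{\partial \vartheta} = \frac{3n}{2\vartheta} = \frac{3}{2}\, nkT, \qquad
       -\overline{\delta A} = p\, dV = \frac{nkT}{V}\, dV, $$

so that

    $$ \delta Q = dE - \overline{\delta A} = dE + p\, dV = \frac{3kn}{2}\, dT + \frac{knT}{V}\, dV, $$

    $$ \frac{\delta Q}{T} = \frac{3kn}{2}\, \frac{dT}{T} + kn\, \frac{dV}{V} = d\left[\frac{3kn}{2} \log T + kn \log V\right], $$

from which follows

(82)    $$ S = \frac{3kn}{2} \log T + kn \log V + C, $$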

where C is a constant. 
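
The statement that $\delta Q / T$ is a total differential while $\delta Q$ is not can be illustrated numerically for the ideal gas. The sketch below integrates $\delta Q = \tfrac{3}{2} nk\, dT + (nkT/V)\, dV$ along two different paths between the same pair of states; the function names, the choice of paths and the units with $n = k = 1$ are assumptions of the illustration.

```python
import math

def integrate_heat(path, n, k, over_temperature, steps=20000):
    """Midpoint-rule integral of dQ (or of dQ/T) along straight segments in the (T, V) plane.

    path is a list of (T, V) vertices; dQ = 1.5*n*k*dT + n*k*T*dV/V for the ideal gas.
    """
    total = 0.0
    for (t0, v0), (t1, v1) in zip(path, path[1:]):
        d_t = (t1 - t0) / steps
        d_v = (v1 - v0) / steps
        for j in range(steps):
            s = (j + 0.5) / steps
            temp = t0 + s * (t1 - t0)
            vol = v0 + s * (v1 - v0)
            d_q = 1.5 * n * k * d_t + n * k * temp * d_v / vol
            total += d_q / temp if over_temperature else d_q
    return total

def entropy(temp, vol, n, k, c=0.0):
    """Formula (82): S = (3kn/2) log T + kn log V + C."""
    return 1.5 * k * n * math.log(temp) + k * n * math.log(vol) + c
```

Heating at constant volume and then expanding, or expanding first and then heating, should give the same value of the integral of $\delta Q / T$, equal to the difference of (82) between the end states, whereas the total heat $\int \delta Q$ differs between the two paths.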

As we have already mentioned, we will not be concerned in 
this book with the derivation of the entire system of thermo¬ 
dynamics; this can be done on the basis of its two fundamental 
laws without relation to the statistical point of view. Our task 
was only to show that these two fundamental laws represent 
the necessary consequences of our point of view. 

However, in concluding this chapter, we must discuss in some 
detail a number of fundamental questions connected with the 
notion of entropy. 





33. The properties of entropy. The notion of entropy is
one of the most important physical notions from a theoretical
as well as from a practical point of view. Very few other
notions can compete with it in respect to the abundance of
attempts to clarify its theoretical and philosophical meaning.

Many of these attempts are closely connected with the
statistical interpretation of the phenomenon of heat, and are
sometimes directly based on such an interpretation. Our problem
is to see to what extent such probabilistic foundations of
thermodynamics give a basis for certain far reaching statements
concerning the nature of entropy.

In the above discussions we have always defined the quantity
$\vartheta$ as a single root of the equation

    $$ \frac{d \log \Phi(\alpha)}{d\alpha} + E = 0. $$

Hence, as we have seen in chapter IV, $\vartheta$ coincides with
the value of the variable $\alpha$ for which the function

    $$ e^{\alpha E}\, \Phi(\alpha) $$

and, consequently, also its logarithm

(83)    $$ E\alpha + \log \Phi(\alpha) $$

possess a single minimum. But, by the definition in the previous
section the quantity

    $$ E\vartheta + \log \Phi(\vartheta) $$

is equal to the entropy of the system except for a constant
factor. Thus we see that the entropy can be defined (up to a
constant factor) as the minimum value of the function (83),
which is attained for the argument $\alpha = \vartheta$. This
permits us to establish one important property of the entropy.

Suppose we have two systems which together form a composite
system. We will use the indices 1 and 2 for quantities
pertaining to these systems, and no index for the composite
system. Since $E = E_1 + E_2$ and

    $$ \Phi(\alpha) = \Phi_1(\alpha)\, \Phi_2(\alpha) $$

we have

    $$ E\alpha + \log \Phi(\alpha) = [E_1 \alpha + \log \Phi_1(\alpha)] + [E_2 \alpha + \log \Phi_2(\alpha)] $$

so that

    $$ S = k[E\vartheta + \log \Phi(\vartheta)]
         = k[E_1 \vartheta + \log \Phi_1(\vartheta)] + k[E_2 \vartheta + \log \Phi_2(\vartheta)]. $$

Since the functions $E_1 \alpha + \log \Phi_1(\alpha)$ and
$E_2 \alpha + \log \Phi_2(\alpha)$ reach a minimum for
$\alpha = \vartheta_1$ and $\alpha = \vartheta_2$ respectively, we
obtain

    $$ S \ge k[E_1 \vartheta_1 + \log \Phi_1(\vartheta_1)] + k[E_2 \vartheta_2 + \log \Phi_2(\vartheta_2)] = S_1 + S_2. $$

This means that the entropy of the system obtained by bringing
into thermal interaction two previously isolated systems is
never smaller than the sum of the entropies of the two
components; the two quantities become equal if the two
components have originally the same temperature.
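
The inequality can be verified on a simple case. The sketch below assumes two ideal monatomic gases kept in separate vessels and exchanging energy only, with $\log \Phi_j(\alpha) = n_j \log V_j - \tfrac{3 n_j}{2} \log \alpha$; the additive constants of $\log \Phi_j$ are dropped since they cancel in the comparison. All names and sample values are hypothetical.

```python
import math

def log_phi(n, volume, alpha):
    """log Phi(alpha) of an ideal monatomic gas, additive constants omitted (assumed form)."""
    return n * math.log(volume) - 1.5 * n * math.log(alpha)

def entropy_over_k(n, volume, energy):
    """S/k = E*theta + log Phi(theta), with theta the root of d(log Phi)/d(alpha) + E = 0."""
    theta = 1.5 * n / energy
    return energy * theta + log_phi(n, volume, theta)

def composite_entropy_over_k(n1, v1, e1, n2, v2, e2):
    """S/k of the united system, using Phi = Phi_1 * Phi_2 and E = E_1 + E_2."""
    theta = 1.5 * (n1 + n2) / (e1 + e2)
    return (e1 + e2) * theta + log_phi(n1, v1, theta) + log_phi(n2, v2, theta)
```

The difference between `composite_entropy_over_k(...)` and the sum of the two `entropy_over_k(...)` values should be positive when the components start with different values of $\vartheta$ and should vanish, up to rounding, when they start with the same value.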

It must be noticed here that this theorem is often used,
without sufficient foundation, in reaching rather broad
conclusions, and the theorem itself is often expressed in rather
indefinite and exaggerated terms. For instance, one states
that because of thermal interaction of material bodies the
entropy of the universe is constantly increasing. It is also
stated that the entropy of a system "which is left to itself"
must always increase; taking into account the probabilistic
foundation of thermodynamics, one often ascribes to this
statement a statistical rather than an absolute character. This
formulation is wrong if only because the entropy of an isolated
system is a thermodynamic function (not a phase function),
which means that it cannot be considered as a random quantity;
if $E$ and all $\lambda_s$ remain constant the entropy cannot
change its value whereas by changing these parameters in an
appropriate way we can make the entropy increase or decrease at
will. Some authors* try to generalize the notion of entropy by
considering it as being a phase function which, depending on the
phase, can assume different values for the same set of
thermodynamical parameters, and try to prove that entropy so
defined must increase, with overwhelming probability. However,
such a proof has not yet been given, and it is not at all clear
how such an artificial generalization of the notion of entropy
could be useful to the science of thermodynamics.

We will arrive at a much more rational formulation of the
problem if we consider the given system as a part of another
more extensive system. Let us assume that this more
extensive system (which we will characterize by the asterisk)
is in thermal equilibrium (compare chapter V, section 25). In
other words, our system represents only an infinitesimally small
part of this large system. In this case the energy $E$ of the given
system is no longer determined by the values of the parameters
$\vartheta$, $\lambda_s$ ($\vartheta = \vartheta^*$ since, being in thermal equilibrium with the
larger system, our system has the same temperature) but is a
random quantity whose distribution law is given approximately
(compare chapter V, section 20) by the density:

$$\frac{\Omega(E)\,e^{-\vartheta E}}{\Phi(\vartheta)}.$$

It is clear that the relation $E = -(d\log\Phi)/d\vartheta$ cannot hold in
this case since the left hand side of this expression is a random
quantity, whereas the right hand side is determined exactly
by the temperature $\vartheta^*$ of the thermostat. Thus the quantity
$E$ can no longer play the same role as before with respect to
the second law of thermodynamics.

*Comp. Borel, Mécanique statistique classique, Paris 1926.

However, the mean value of $E$, given by

$$\bar E = \frac{1}{\Phi(\vartheta)}\int E\,\Omega(E)\,e^{-\vartheta E}\,dE = -\frac{d\log\Phi}{d\vartheta},$$

is related to $\vartheta$ in the same way as in the case of an isolated
system. Thus, defining the entropy of the system by the
expression

$$S = k[\bar E\vartheta + \log\Phi(\vartheta)] \tag{84}$$

we deal with a thermodynamic function which is subject to all
the arguments set forth in the previous section; in particular
the proof of the second law of thermodynamics remains completely
unchanged.

Using the formula (84) one can give a simple derivation of 
one of the most fundamental inequalities of thermodynamics. 
Let us assume that the system described by the numbers $E_1$,
$\vartheta_1$, $S_1$ interacts thermally with another system described by
the numbers $E_2$, $\vartheta_2$, $S_2$ (this second system is necessarily
considered to be a thermostat). Let the numbers $E = E_1 + E_2$,
$\vartheta$, $S$ characterize this composite system. Since the function
$E_1\alpha + \log\Phi_1(\alpha)$ possesses a minimum for $\alpha = \vartheta_1$ we have

$$S_1 = k[E_1\vartheta_1 + \log\Phi_1(\vartheta_1)] \le k[E_1\vartheta + \log\Phi_1(\vartheta)].$$

On the other hand formula (84) gives us

$$\bar S_1 = k[\bar E_1\vartheta + \log\Phi_1(\vartheta)],$$


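where $\bar E_1$ is the mean energy and $\bar S_1$ the entropy of the given
system considered as a part of the composite system. It follows:

$$\bar S_1 - S_1 \ge k\vartheta(\bar E_1 - E_1)$$

where, because of the approximate expression (51),

$$E_1 = -\left(\frac{d\log\Phi_1}{d\alpha}\right)_{\alpha=\vartheta_1},\qquad \bar E_1 = -\left(\frac{d\log\Phi_1}{d\alpha}\right)_{\alpha=\vartheta}.$$

If $\vartheta_1 < \vartheta < \vartheta_2$, we have $E_1 \ge \bar E_1$ since, as we know, the function
$(d\log\Phi)/d\alpha$ never decreases. Therefore:

$$\vartheta(\bar E_1 - E_1) \ge \vartheta_2(\bar E_1 - E_1).$$

The same relation takes place for $\vartheta_1 > \vartheta > \vartheta_2$ since in this
case $E_1 \le \bar E_1$.

Thus, in all cases we have

$$\bar S_1 - S_1 \ge k\vartheta_2(\bar E_1 - E_1) = \frac{\bar E_1 - E_1}{T_2},$$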

which can be described by the following statement. The entropy 
increase of a system resulting from its thermal interaction with 
any other system cannot be smaller than the energy increase of the 
first system divided by the absolute temperature of the second. 
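
As a hedged numerical illustration of this inequality, the sketch below again assumes two ideal monatomic gases with $\log\Phi_i(\vartheta) = -(3n_i/2)\log\vartheta$ and works in units where $k = 1$, so that $1/T_2 = \vartheta_2$. The function `clausius_gap` (our name) returns the left hand side minus the right hand side; the statement above asserts that it is never negative.

```python
import math

def clausius_gap(n1, E1, n2, E2):
    """(entropy gain of system 1)/k minus theta2 * (energy gain of system 1)."""
    theta1 = 1.5 * n1 / E1
    theta2 = 1.5 * n2 / E2
    theta = 1.5 * (n1 + n2) / (E1 + E2)  # common inverse temperature after contact
    E1_mean = 1.5 * n1 / theta           # mean energy of system 1 inside the composite
    # S/k = E*theta + log Phi; the product E*theta = 3n/2 cancels in the difference.
    entropy_gain = 1.5 * n1 * (math.log(theta1) - math.log(theta))
    return entropy_gain - theta2 * (E1_mean - E1)

if __name__ == "__main__":
    print(clausius_gap(100, 300.0, 200, 150.0))  # system 1 initially hotter
    print(clausius_gap(100, 50.0, 200, 700.0))   # system 1 initially colder
    print(clausius_gap(100, 150.0, 200, 300.0))  # equal temperatures
```

Both directions of heat flow give a non-negative gap, and the gap vanishes when the initial temperatures coincide.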

One can, of course, generalize the notion of the entropy by
writing for any state of the system

$$S = k[E\vartheta + \log\Phi(\vartheta)],$$

in which case the entropy itself becomes a random quantity.

For such a definition of entropy the second law of thermodynamics
loses meaning, remaining applicable, however, to the
mean value $\bar S$ which apparently is identical with the expression
(84). In this case the distribution law of the given system in
its phase space is given by the probability density

$$q = \frac{e^{-\vartheta E}}{\Phi(\vartheta)},$$

from which follows

$$S = -k\log q.$$


This expression is often used to justify the statement that
“the entropy of a system is proportional to the logarithm of
the probability of the corresponding state” (Boltzmann's
postulate). This statement, which is absolutely meaningless in
the case of an isolated system, obtains, as we see, some meaning
for a system in the larger system. This can be accomplished,
however, only by using the above described generalization of
the notion of entropy which is introduced “ad hoc”. In fact,
one must not forget that this notion is used in connection with
the second law of thermodynamics which loses meaning when
the generalized definition of entropy is used. All existing
attempts to give a general proof of this postulate must be considered
as an aggregate of logical and mathematical errors
superimposed on a general confusion in the definition of the
basic quantities.* In the most serious treatises on that subject
(for example: R. H. Fowler “Statistical Mechanics” Cambridge
1936) the authors refuse to accept this postulate, indicating
that it cannot be proved, and cannot be given a sensible
formulation even on the basis of the exact notions of thermodynamics.

However, proceeding in this direction we can obtain some 
reasonable and rather interesting results. 

Consider two isolated systems, characterized by the indices 
1 and 2, which form a composite system (characterized by no 
indices). 

The total energy $E$ of the composite system can be distributed
in many different ways between the two thermally interacting
components (the probabilities of various distributions have
been considered in detail in section 22, chapter V). Let us
write $p(E_1)\,dE_1$ for the probability that the energy of the first
system lies in the interval $[E_1, E_1 + dE_1]$.

We know that the sum of the entropies $S_1 + S_2$ of the
isolated systems can never exceed the entropy $S$ of the composite
system so that

$$S - (S_1 + S_2) \ge 0.$$

*Comp. the “proof” in “Thermodynamik” by M. Planck, and the
corresponding critique in “Statistical Mechanics” by R. H. Fowler.


It can be shown that the excess is proportional, except for
a constant additive term, to the logarithm of the quantity
$p(E_1)$. This statement has the following meaning. Consider two
systems which interact thermally forming a single system with
the energy $E$ and the entropy $S$. At some moment we isolate
the two systems from each other thus obtaining two systems
with energies $E_1$ and $E_2$ ($E_1 + E_2 = E$), and the entropies $S_1$
and $S_2$ ($S_1 + S_2 \le S$). The quantities $E_1$ and $E_2$ must be
considered as random quantities with the probability densities
$p(E_1)$ and $p(E_2)$. Our statement tells us that the quantity
$\log p(E_1)$ (as well as $\log p(E_2)$) is connected linearly with the
quantity $S - (S_1 + S_2)$.

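To prove this we must remember that, according to formula
(27) chapter IV:

$$p(E_1) = \frac{\Omega_1(E_1)\,\Omega_2(E_2)}{\Omega(E)}.$$

On the other hand, because of (42) chapter V, we have, approximately,

$$\Omega(E) = \frac{\Phi(\vartheta)\,e^{\vartheta E}}{(2\pi B)^{1/2}} = \frac{e^{S/k}}{(2\pi B)^{1/2}}.$$

In similar fashion

$$\Omega_i(E_i) = \frac{e^{S_i/k}}{(2\pi B_i)^{1/2}}\qquad (i = 1, 2),$$

where

$$B = \left(\frac{d^2\log\Phi}{d\alpha^2}\right)_{\alpha=\vartheta},\qquad B_i = \left(\frac{d^2\log\Phi_i}{d\alpha^2}\right)_{\alpha=\vartheta_i}\quad (i = 1, 2).$$

This gives us the approximate relation:

$$\log p(E_1) = -\frac{1}{k}[S - (S_1 + S_2)] + \frac{1}{2}\log\frac{B}{2\pi B_1B_2}. \tag{85}$$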


The second term on the right hand side cannot be considered
as a constant (independent of chance) quantity since $B_1$ and
$B_2$ depend on $\vartheta_1$ and $\vartheta_2$, which are in turn connected with the
random quantity $E_1$. It can be easily shown, however, that in
the case when $E_1$ does not deviate appreciably from its mean
value $\bar E_1$ (for which $\vartheta_1 = \vartheta$), the values of $B_1$ and $B_2$ are very
close to

$$\left(\frac{d^2\log\Phi_1(\alpha)}{d\alpha^2}\right)_{\alpha=\vartheta},\qquad \left(\frac{d^2\log\Phi_2(\alpha)}{d\alpha^2}\right)_{\alpha=\vartheta}.$$

These quantities, which do not contain any random element,
will be denoted by $B_1'$ and $B_2'$. We will indicate here only the
main steps of this proof, leaving the more detailed calculations
to the reader.

Because of the fundamental law of composition of the generating
functions, the logarithms of these functions, as well as
all derivatives of these logarithms, are infinitely large quantities
of the order $n$, where $n$ is the number of molecules forming
the system. In particular the quantities $E_1$, $E_2$, $B_1$, $B_2$, $B_1'$,
$B_2'$ are infinitely large quantities in the above described sense.
For the same reason the quantities $B_1 - B_1'$ and $B_2 - B_2'$
have orders of magnitude $n(\vartheta_1 - \vartheta)$ and $n(\vartheta_2 - \vartheta)$ for small
values of $(\vartheta_1 - \vartheta)$ and $(\vartheta_2 - \vartheta)$. Thus the ratios $B_1'/B_1$ and
$B_2'/B_2$ differ from unity by quantities of the order $(\vartheta_1 - \vartheta)$ and
$(\vartheta_2 - \vartheta)$. On the other hand:

$$E_1 = -\left(\frac{d\log\Phi_1(\alpha)}{d\alpha}\right)_{\alpha=\vartheta_1}$$

and, approximately (according to (51) chapter V):

$$\bar E_1 = -\left(\frac{d\log\Phi_1(\alpha)}{d\alpha}\right)_{\alpha=\vartheta}$$

so that

$$E_1 - \bar E_1 = -B_1'(\vartheta_1 - \vartheta) + O[n(\vartheta_1 - \vartheta)^2].$$




Thus we conclude that for 





the difference $\vartheta_1 - \vartheta$ is an infinitesimally small quantity of
the order:

$$\frac{E_1 - \bar E_1}{n}.$$

Consequently: 



If, as we have assumed, the deviation of $E_1$ from its mean value
$\bar E_1$ is negligibly small in comparison with $n$ (which is extremely
probable because, as we shall see later, the mean square deviation
of the quantity $E_1$ is of the order of magnitude of $n^{1/2}$), the
estimates given above permit us to substitute the quantities
$B_1'$ and $B_2'$ for the quantities $B_1$ and $B_2$ in the formula (85),
and obtain the approximate relation

$$\log p(E_1) = \frac{1}{k}(S_1 + S_2 - S) + \frac{1}{2}\log\frac{B}{2\pi B_1'B_2'}$$

which proves our statement.
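
The quality of this approximation can be examined on a model where $p(E_1)$ is known exactly. The sketch below is an illustration only, under assumptions that are ours: each system has generating function $\Phi_i(\alpha) = \alpha^{-a_i}$ (an ideal gas with $a_i = 3n_i/2$, volume factor dropped), whose structure function is $\Omega_i(x) = x^{a_i-1}/\Gamma(a_i)$, and $k = 1$. The exact logarithm of $p(E_1)$ is compared with the entropy expression using $B_1'$, $B_2'$.

```python
import math

def entropy_over_k(a, E):
    """S/k = E*theta + log Phi(theta) for the assumed log Phi = -a*log(theta), theta = a/E."""
    return a - a * math.log(a / E)

def log_omega(a, x):
    """Logarithm of the assumed structure function Omega(x) = x**(a-1) / Gamma(a)."""
    return (a - 1.0) * math.log(x) - math.lgamma(a)

def log_p_exact(a1, a2, E, E1):
    """Exact log p(E1) = log[Omega1(E1) * Omega2(E - E1) / Omega(E)]."""
    return log_omega(a1, E1) + log_omega(a2, E - E1) - log_omega(a1 + a2, E)

def log_p_approx(a1, a2, E, E1):
    """(S1 + S2 - S)/k + (1/2) log[B / (2 pi B1' B2')], primes evaluated at theta."""
    a = a1 + a2
    theta = a / E
    excess = entropy_over_k(a, E) - entropy_over_k(a1, E1) - entropy_over_k(a2, E - E1)
    B, B1p, B2p = a / theta**2, a1 / theta**2, a2 / theta**2
    return -excess + 0.5 * math.log(B / (2.0 * math.pi * B1p * B2p))

if __name__ == "__main__":
    a1, a2, E = 450.0, 1050.0, 1500.0   # n1 = 300 and n2 = 700 molecules
    for E1 in (430.0, 450.0, 460.0):
        print(E1, log_p_exact(a1, a2, E, E1), log_p_approx(a1, a2, E, E1))
```

For energies within about one mean square deviation of the mean the two values differ only in the second decimal place, which is the behavior the argument above leads one to expect.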


34. Other thermodynamical functions. Before finishing 
this chapter we want to consider the expression of some other 
important thermodynamic functions in terms of our general 
theory. 

1. The thermodynamical potential or characteristic function of
Planck 

X.) = log ^( 1 ?) 




is a convenient function because of the fact that almost all 
most important thermodynamical functions of a system can be 
expressed in terms of it. Thus, 




($s = 1, 2, \ldots, r$),


etc. 

2. Heat capacity. Let us assume that the temperature and the
external parameters of a given system are subject simultaneously
to some infinitesimal change. The heat capacity of the
system is defined as the quantity

$$C = \frac{\delta Q}{dT} = \frac{dE}{dT} + \frac{dA}{dT}$$

where $dA$ is the mean value of the element of work discussed
in section 30. It is clear that this quantity assumes different
values for different changes of the parameters $\lambda_s$.

In the particular case when all the parameters $\lambda_s$ remain
constant ($d\lambda_s = 0$) we have

$$C = \frac{dE}{dT} = \frac{dE/d\vartheta}{d(1/k\vartheta)/d\vartheta} = k\vartheta^2\,\frac{d^2\log\Phi}{d\vartheta^2} = k\vartheta^2B.$$

Let us consider the special case of an ideal monatomic gas
enclosed in a vessel, subject to no external forces except the
reactions of the vessel's walls. In this case the only external
parameter will be the volume $V$ of the gas. Since $B = (3n)/(2\vartheta^2)$,
the heat capacity for constant volume is given by

$$C_V = \tfrac{3}{2}kn. \tag{86}$$

Thus it is independent not only of the volume and temperature
but also of the nature of the gas.

The heat capacity of the gas calculated under the assumption
of constant pressure, i.e.

$$p = nkT/V = \text{const.}, \tag{87}$$

also plays an important role in physics.

In order to calculate this heat capacity, we notice that the
entropy of the ideal gas is given according to (82) by the
expression:

$$S = k[E\vartheta + \log\Phi(\vartheta)] = kn\log V + \tfrac{3}{2}kn\log T + \text{const.}$$

Since, according to the second law of thermodynamics,

$$\delta Q/T = dS,$$

we conclude that for any changes of $V$ and $T$ the heat capacity
is given by:

$$C = \frac{\delta Q}{dT} = \frac{knT}{V}\,\frac{dV}{dT} + \frac{3}{2}kn$$

(in the particular case $dV = 0$ this reduces to the expression
(86)). In the case of constant pressure $dV$ and $dT$ are related
by (87), giving us the result: $C_p = (5/2)kn$ (remembering that
$nkT/V = p$).

Thus the quantity $C_p/C_V = 5/3$ represents a “universal
constant” which remains the same for any amount of any monatomic
gas under any physical conditions. This statement is
generally in a good agreement with experiments, and the observed
deviations can always be satisfactorily explained by the
fact that the actual gases are never ideal.
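
Both heat capacities can be recovered numerically from the generating function and the entropy. The following is a minimal sketch, assuming $\log\Phi(\vartheta) = n\log V - (3n/2)\log\vartheta$ up to a constant and units with $k = 1$; the step sizes and function names are our own choices, and the derivatives are taken by central differences.

```python
import math

K = 1.0  # Boltzmann constant in arbitrary units (an assumption of this sketch)

def log_phi(theta, n, V):
    """Assumed generating function of the ideal monatomic gas, up to a constant."""
    return n * math.log(V) - 1.5 * n * math.log(theta)

def c_v(T, n, V, h=1e-4):
    """C_V = k * theta^2 * B, with B = d^2 log Phi / d theta^2 by central differences."""
    theta = 1.0 / (K * T)
    B = (log_phi(theta + h, n, V) - 2.0 * log_phi(theta, n, V)
         + log_phi(theta - h, n, V)) / h**2
    return K * theta**2 * B

def entropy(T, n, V):
    """S = k n log V + (3/2) k n log T, additive constant omitted."""
    return K * n * math.log(V) + 1.5 * K * n * math.log(T)

def c_p(T, n, p, h=1e-5):
    """C_p = T dS/dT along the constant-pressure path V = n k T / p."""
    volume = lambda t: n * K * t / p
    dS = entropy(T + h, n, volume(T + h)) - entropy(T - h, n, volume(T - h))
    return T * dS / (2.0 * h)

if __name__ == "__main__":
    n = 1000
    print(c_v(2.0, n, 3.0), c_p(2.0, n, 0.7), c_p(2.0, n, 0.7) / c_v(2.0, n, 3.0))
```

The ratio printed last does not depend on the chosen temperature, volume, or pressure, in line with the remark about the universal constant.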



CHAPTER VIII

DISPERSION AND THE DISTRIBUTIONS OF SUM FUNCTIONS

35. The intermolecular correlation. Let us consider an
isolated system consisting of a large number $n$ of molecules, and
let us select any two of these molecules. Let $\varphi(P)$ be a phase
function of our system depending only on the dynamic coordinates
of the first molecule, and $\psi(P)$ a phase function depending
only on the dynamic coordinates of the second molecule.
The functions $\varphi(P)$ and $\psi(P)$, considered as random
quantities, are not statistically independent of one another
since their dynamic coordinates are related by the condition
that the total energy $E$ of the system remains constant. Because
of the large number of individual molecules composing
the system one should expect, of course, that the dependence
between the quantities $\varphi(P)$ and $\psi(P)$ is very weak. In particular,
one can expect that in any calculation the coefficient of
correlation between these two quantities is infinitesimally
small. As we shall soon see this is actually true. However, in
many problems (in particular in the calculation of the dispersion
of sum functions) one must calculate sums of a very large
number of such correlation coefficients; these sums often may
apparently diverge, and thus cannot be neglected.* For this
reason it is necessary to have at least an approximate expression
for the intermolecular correlation coefficient. This
question will be discussed in the present section.

It is clear that in order to obtain the asymptotic formula we
must impose certain limitations on the structure functions
$\omega_1(x)$ and $\omega_2(x)$ of the two molecules selected as well as on the
functions $\varphi(P)$ and $\psi(P)$. The point is, that we are looking for
asymptotic formulae under the assumption that the number
of molecules $n$ as well as the total energy $E$ of the system
becomes infinitely large; it is clear without any calculation that
an unduly rapid increase of the four above mentioned functions
would disrupt their general asymptotic relations.

For our purposes it is quite sufficient to assume that each
of these four functions increases less rapidly than $Cx^r$, where
$x$ is the energy (of the first or second molecule) corresponding
to a given argument, and $C$ and $r$ are positive constants. This
condition is actually satisfied in practically all problems of
statistical mechanics.

*This is particularly so in the case when, not being satisfied by estimates
of the order of magnitude of the increase of dispersion, we want to obtain
its asymptotic expression. This occurs, for example, in the comparison
between theoretical and experimental results concerning fluctuations.

We will use the following notations:

$\Omega(x)$ = structure function of the system,

$\Phi(\alpha)$ = generating function of the system,

$U(x)$ = conjugate distribution of the system,

$$B = \left(\frac{d^2\log\Phi(\alpha)}{d\alpha^2}\right)_{\alpha=\vartheta},\qquad A\,(= E) = -\left(\frac{d\log\Phi(\alpha)}{d\alpha}\right)_{\alpha=\vartheta}.$$

We will use the same letters with the upper indices (1), (2), and (12) to denote
quantities for the system without the first molecule, without
the second molecule, and without both of them respectively.
We will further denote by $e_1$, $\varphi_1(\alpha)$, $\omega_1(x)$, $\gamma_1$, $dv_1$ the energy,
the generating function, the structure function, the phase space,
and the volume element in it for the first molecule. The same letters
with the index 2 will denote the same quantities for the second molecule,
whereas the index 12 will refer to the combination of the two
molecules. Obviously:

$$e_{12} = e_1 + e_2,\qquad \varphi_{12}(\alpha) = \varphi_1(\alpha)\,\varphi_2(\alpha),\qquad dv_{12} = dv_1\,dv_2$$

and the space $\gamma_{12}$ is the direct product of $\gamma_1$ and $\gamma_2$.

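The correlation coefficient connecting the functions $\varphi$ and
$\psi$ is determined by the expression

$$R(\varphi, \psi) = \frac{\overline{(\varphi - \bar\varphi)(\psi - \bar\psi)}}{(D\varphi\,D\psi)^{1/2}}$$

where $D$ is the symbol of dispersion. For the numerator of
the above expression we have (according to (26) chapter IV)
the following basic expression:

$$\overline{(\varphi - \bar\varphi)(\psi - \bar\psi)} = \frac{1}{\Omega(E)}\int_{\gamma_{12}}[\varphi(P) - \bar\varphi][\psi(P) - \bar\psi]\,\Omega^{(12)}(E - e_1 - e_2)\,dv_{12}.$$

Expressing the structure function in terms of the conjugate
distribution functions (by means of (34) chapter IV), and
putting $\alpha = \vartheta$, we have:

$$R(\varphi, \psi) = \frac{1}{(D\varphi\,D\psi)^{1/2}}\int_{\gamma_{12}}[\varphi(P) - \bar\varphi][\psi(P) - \bar\psi]\,\frac{e^{-\vartheta(e_1 + e_2)}}{\varphi_1(\vartheta)\,\varphi_2(\vartheta)}\,L(e_1, e_2)\,dv_{12} \tag{88}$$

where, for brevity

$$L(e_1, e_2) = \frac{U^{(12)}(E - e_1 - e_2)}{U(E)}.$$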


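As usual, our aim is to secure an asymptotic expression for
$R(\varphi, \psi)$ under the assumption $n \to \infty$.

We begin with some general considerations. First of all we
can see from the approximate formula (39) (chapter V) that
the quantity $L(e_1, e_2)$ is bounded uniformly for any values of
$e_1$ and $e_2$ when $n \to \infty$; in fact, the quantity $U(E)$ is asymptotically
equal to $(2\pi B)^{-1/2}$ (i.e. is an infinitely small quantity of
the order $n^{-1/2}$) whereas, because of the same formula, the
order of magnitude of $U^{(12)}(E - e_1 - e_2)$ is in any case not
lower than $n^{-1/2}$.

We will also notice that for any arbitrary constants $a$, $b$,
$c$, $d$, and for the constant values $k \ge 0$, $l \ge 0$, we have for
$n \to \infty$:

$$\int_{e_1 > \log^2 n}[a\varphi(P) + b]^k[ce_1 + d]^l\,e^{-\vartheta e_1}\,dv_1 = O(n^{-g}),\qquad \int_{e_2 > \log^2 n}[a\psi(P) + b]^k[ce_2 + d]^l\,e^{-\vartheta e_2}\,dv_2 = O(n^{-g}), \tag{89}$$

where $g$ is any real number. Furthermore, because of the above
assumptions concerning the properties of $\varphi(P)$ and $\omega_1(x)$ we
have, for example, for the first integral:

$$\left|\int_{e_1 > \log^2 n}[a\varphi(P) + b]^k[ce_1 + d]^l\,e^{-\vartheta e_1}\,dv_1\right| < C\int_{e_1 > \log^2 n} e_1^{kr+l}\,e^{-\vartheta e_1}\,dv_1 = C\int_{\log^2 n}^{\infty} x^{kr+l}\,\omega_1(x)\,e^{-\vartheta x}\,dx$$

$$< C_1\int_{\log^2 n}^{\infty} x^{kr+l+r}\,e^{-\vartheta x}\,dx < C_1\int_{\log^2 n}^{\infty} e^{-\vartheta x/2}\,dx = O(n^{-g})$$

where $C$ and $C_1$ are positive constants.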

Turning now to the derivation of the asymptotic formula
for the quantity $R(\varphi, \psi)$, we use the formula (99) (see appendix)
for an estimate of the quantities $U^{(12)}(E - e_1 - e_2)$ and
$U(E)$. Also, using the arguments given at the end of the
appendix we find that for

$|e_1 - \bar e_1| < \log^2 n$, and $|e_2 - \bar e_2| < \log^2 n$:

+ ^8/a + ^ 

where we put for brevity: 

= w. 

On the other hand we have: 



UiE) = + 0(n-^. 

Thus, in the part $\gamma_{12}'$ of the space $\gamma_{12}$ which is characterized
by the inequalities $e_1 < \log^2 n$ and $e_2 < \log^2 n$, we have


L(ei , ea) 



+ (2»r)‘^* 


Sn + Tn'W 

B* 



[1 + (2ir)‘^*B,3-* + 0(n-*^*)] 





B 


B 


( 12 ) 


)!■ - 1 ) 


+ + {7nry'^T.B-^w 


l/2m n-2. 


+ o (^-^ y ^)]/[1 + S,B-^ + 0{n^'^] 


Noticing that

$$\left(\frac{B}{B^{(12)}}\right)^{1/2} = 1 + \frac{b_1 + b_2}{2B} + O(n^{-2}),$$


and denoting for brevity 


1 + = A, 


we find 

L(ei , ea) 


A + 


bi -{• bi , Nl/im r>-2 W’ 


g + 0 


(1^) 


A + 0(n‘®^*) 


(90) 


= 1 + 


+ (2.y'^T,B-‘w - ^ + o(ii 3 y^) . 


2B 


2B 


The right hand side of this equality has the form: 


(ei - €i)(€i " €2) 

B 


-f- + 0 


(1^) 


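where $R$ is formed by the terms of the type

$$K(e_i - \bar e_i)^s$$

($K$ is constant, $i = 1$ or $2$, $s = 0$, $1$ or $2$). But (for $i = 1$)

$$\int_{\gamma_{12}}(\varphi - \bar\varphi)(\psi - \bar\psi)\,e^{-\vartheta(e_1 + e_2)}\,K(e_1 - \bar e_1)^s\,dv_{12} = K\int_{\gamma_1}(\varphi - \bar\varphi)(e_1 - \bar e_1)^s\,e^{-\vartheta e_1}\,dv_1\int_{\gamma_2}(\psi - \bar\psi)\,e^{-\vartheta e_2}\,dv_2.$$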
According to formula (48) chapter V, the second integral
on the right hand side is an infinitesimally small quantity of
order not lower than $n^{-1}$. The first integral is of the same
order of magnitude for $s = 0$, and remains bounded from
above for $s = 1$ or $2$. Therefore the terms with $s = 0$ (i.e.
$1 + (b_1 + b_2)/2B$) in the expression for $R$, when integrated
over the entire space $\gamma_{12}$, give a quantity of the order of
magnitude $O(n^{-2})$, in the estimate of the integral (88). All
other terms (i.e. those with $s = 1$ and $s = 2$) contain a constant
(with respect to the variables of integration) factor of the
order $n^{-1}$, giving after integration quantities of the order
$O(n^{-2})$. Thus:

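$$\int_{\gamma_{12}}(\varphi - \bar\varphi)(\psi - \bar\psi)\,e^{-\vartheta(e_1 + e_2)}\,R\,dv_{12} = O(n^{-2}).$$

Since, on the other hand, because of (89):

$$\int_{\gamma_{12}''}(\varphi - \bar\varphi)(\psi - \bar\psi)\,e^{-\vartheta(e_1 + e_2)}\,R\,dv_{12} = O(n^{-2})$$

where $\gamma_{12}'' = \gamma_{12} - \gamma_{12}'$ is the part of the space $\gamma_{12}$ determined
by the condition

$$\max(e_1, e_2) > \log^2 n$$

we conclude that:

$$\int_{\gamma_{12}'}(\varphi - \bar\varphi)(\psi - \bar\psi)\,e^{-\vartheta(e_1 + e_2)}\,R\,dv_{12} = O(n^{-2}).$$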

Denoting by Q the remaining terms in the formula (90), 
and by C and Ci two positive constants, we have: 

< f U-^||iA-'^l(l + |u> dv^, 

V 

= C,n-^'\ 




Comparing the estimates so obtained we have: 


/ — 
*7 It 






U(E) 


dvi 


= - /..., 


i<p - <p){^ *“ 


(^1 6i)(C 2 g?) 




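Since, as we have seen above, the ratio

$$\frac{U^{(12)}(E - e_1 - e_2)}{U(E)}$$

remains uniformly bounded in the entire space $\gamma_{12}$ for $n \to \infty$,
the estimate given by (89) permits us to integrate both parts
of the above relation over the entire space $\gamma_{12}$. Therefore,
using (88), we obtain:

$$R(\varphi, \psi) = -\frac{1}{B\,(D\varphi\,D\psi)^{1/2}}\int_{\gamma_1}(\varphi - \bar\varphi)(e_1 - \bar e_1)\,\frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)}\,dv_1\int_{\gamma_2}(\psi - \bar\psi)(e_2 - \bar e_2)\,\frac{e^{-\vartheta e_2}}{\varphi_2(\vartheta)}\,dv_2 + O(n^{-3/2}).$$

Using the formula (48) chapter V

$$\int_{\gamma_1}\varphi\,e_1\,\frac{e^{-\vartheta e_1}}{\varphi_1(\vartheta)}\,dv_1 = \overline{\varphi e_1} + O(n^{-1})$$

we can give similar estimates for other integrals, in which
$\varphi e_1$ is replaced by $\varphi$, $e_1$, $\psi e_2$, $\psi$, $e_2$. This gives

$$R(\varphi, \psi) = -\frac{1}{B}\,\frac{\overline{(\varphi - \bar\varphi)(e_1 - \bar e_1)}\;\overline{(\psi - \bar\psi)(e_2 - \bar e_2)}}{(D\varphi\,D\psi)^{1/2}} + O(n^{-3/2}) = -\frac{(b_1b_2)^{1/2}}{B}\,R(\varphi, e_1)\,R(\psi, e_2) + O(n^{-3/2}). \tag{91}$$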


Both correlation coefficients entering into this expression
connect two phase functions of the same molecule, and can
be easily calculated if these functions are known. The basic
value of the asymptotic relation which is obtained in this way
is the fact that the correlation coefficient pertaining to the
phase functions of two different molecules is, asymptotically,
inversely proportional to $B$ (being an infinitesimally small
quantity of the order $n^{-1}$) for ever increasing $n$.

Let us now consider the expression (91) under various possible
assumptions.

1. In the case when the two selected molecules have the same
structure, and the functions $\varphi$ and $\psi$ are the same, we obtain
(putting $b_1 = b_2 = b$):

$$R(\varphi, \psi) = -\frac{b}{B}\,R^2(\varphi, e) + O(n^{-3/2}),$$

where $e$ stands for the energy of one of the selected molecules.

2. If $\varphi = \psi$ and all the molecules have the same structure,
we have $B = nb$ and:

$$R(\varphi, \varphi) = -\frac{1}{n}\,R^2(\varphi, e) + O(n^{-3/2}).$$

3. If $\varphi = e_1$ and $\psi = e_2$, the formula (91) gives us:

$$R(e_1, e_2) = -\frac{(b_1b_2)^{1/2}}{B} + O(n^{-3/2}).$$

The negative sign of the coefficient $R$ in this last case could
be foreseen; in fact, since the stochastic relation between the
energies of two molecules is determined entirely by the condition
that the total energy of all molecules is a constant, the
decrease of the energy of any one molecule favors stochastically
the increase of the energy of any other molecule and vice versa.

4. Finally if all the molecules have the same structure and
$\varphi = e_1$ and $\psi = e_2$, we have:

$$R(e_1, e_2) = -\frac{1}{n} + O(n^{-3/2}).$$

This result is trivial since in this case the formula:

$$0 = DE = D\sum_{k=1}^{n} e_k = \sum_{k=1}^{n} De_k + \sum_{i \ne k}(De_i\,De_k)^{1/2}R(e_i, e_k) = nb + n(n - 1)\,b\,R(e_i, e_k)\qquad (i \ne k)$$

leads to the exact relation:

$$R(e_i, e_k) = -\frac{1}{n - 1}\qquad (i \ne k).$$
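
The exact relation can be illustrated by direct sampling. The sketch below rests on an assumption of ours: every molecule has the structure function $\omega(x) \propto x^{1/2}$ (a free monatomic molecule), so that on the surface of constant total energy the vector of molecular energies follows a scaled Dirichlet law, which can be sampled by normalizing independent gamma variates. The sample size and seed are arbitrary.

```python
import random

def sample_energies(n, total_energy, rng, shape=1.5):
    """One draw of (e_1, ..., e_n) with fixed sum, under the assumed Dirichlet law."""
    gammas = [rng.gammavariate(shape, 1.0) for _ in range(n)]
    scale = total_energy / sum(gammas)
    return [g * scale for g in gammas]

def energy_correlation(n, samples=40000, seed=0):
    """Empirical correlation coefficient between the energies of molecules 1 and 2."""
    rng = random.Random(seed)
    pairs = [tuple(sample_energies(n, 1.0, rng)[:2]) for _ in range(samples)]
    mean1 = sum(p[0] for p in pairs) / samples
    mean2 = sum(p[1] for p in pairs) / samples
    cov = sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / samples
    var1 = sum((p[0] - mean1) ** 2 for p in pairs) / samples
    var2 = sum((p[1] - mean2) ** 2 for p in pairs) / samples
    return cov / (var1 * var2) ** 0.5

if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, energy_correlation(n), -1.0 / (n - 1))
```

Up to sampling noise, the empirical coefficient agrees with $-1/(n-1)$ already for very small $n$, as it must, since the relation is exact and not merely asymptotic.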


36. Dispersion and distribution laws of the sum functions.
As we have seen in section 13, chapter III, the estimates of
dispersion of sum functions play an important role in the
foundation of our entire theory. It is, in fact, the smallness
of the mean square deviations of these functions which permits
us to state that they assume values which are very close to
their mean values.

In the terminology of the theory of probabilities this is
equivalent to the statement that, for an infinitely large number
of molecules, the sum functions are subject to the law of
large numbers. In view of the discussions of the previous
paragraph, obtaining estimates of dispersion of the sum function
does not present any essential difficulties.

Suppose we have a sum phase function:

$$f(P) = \sum_{i=1}^{n} f_i(P)$$

where each term is a function of the dynamical coordinates of
one molecule only. The mean value of the function $f$ is:

$$\bar f = \sum_{i=1}^{n}\bar f_i.$$

In the general case, when for increasing $n$ the quantities $\bar f_i$
remain between two limits of the same sign, the above given
mean value is an infinitely large quantity of the order $n$.

In the most common case, when all the molecules, as well
as all $f_i$-functions, are identical, the quantity $\bar f$ is directly
proportional to $n$.

Let us now estimate the dispersion of the function $f$. We
have

$$Df = \overline{(f - \bar f)^2} = \overline{\Bigl\{\sum_{i=1}^{n}(f_i - \bar f_i)\Bigr\}^2} = \sum_{i=1}^{n}\overline{(f_i - \bar f_i)^2} + \sum_{i \ne k}\overline{(f_i - \bar f_i)(f_k - \bar f_k)}$$

$$= \sum_{i=1}^{n} Df_i + \sum_{i \ne k}(Df_i\,Df_k)^{1/2}R(f_i, f_k).$$

Assuming the functions $f_i$ are subject to the limitations introduced
in the previous section, we obtain from the formula (91)

$$Df = \sum_{i=1}^{n} Df_i - \frac{1}{B}\sum_{i \ne k}(b_ib_k\,Df_i\,Df_k)^{1/2}R(e_i, f_i)\,R(e_k, f_k) + O(n^{1/2}) \tag{92}$$

since the number of the terms in the second sum on the right
hand side is of the order of $n^2$ (in the above expression we
denoted by $e_i$ the energy of that molecule related to the function
$f_i$, and by $b_i$ the dispersion of this energy). The above
relation indicates that, under the assumed conditions:

$$Df = O(n)$$

meaning that the mean square deviation of the function $f$ is
of order not greater than $n^{1/2}$ (i.e. considerably lower than the
mean value of that function). This fact establishes the “representability”
of the mean values of the sum functions, and
permits us to identify them with the time averages which
represent the direct results of any physical measurement.

Let us consider, as an example, the energy of a large component
of some given system, and let us assume that this
component contains the molecules with numbers from $1$ to
$n_1$ ($< n$). We have $f_i = e_i$ ($1 \le i \le n_1$), $f_i = 0$ ($i > n_1$),
$f = \sum_{i=1}^{n_1} e_i$; consequently $Df_i = b_i$ ($i \le n_1$), $Df_i = 0$
($i > n_1$), $R(e_i, f_i) = 1$. Formula (92) now yields

$$Df = \sum_{i=1}^{n_1} b_i - \frac{1}{B}\sum_{\substack{i,k=1\\ i \ne k}}^{n_1} b_ib_k + O(n^{1/2}).$$

Putting $\sum_{i=1}^{n_1} b_i = B'$ we can rewrite the above in the form

$$Df = B' - \frac{1}{B}\Bigl(B'^2 - \sum_{i=1}^{n_1} b_i^2\Bigr) + O(n^{1/2})$$

or, noticing that

$$\frac{1}{B}\sum_{i=1}^{n_1} b_i^2 = O(1)$$

in the form

$$Df = \frac{B'(B - B')}{B} + O(n^{1/2}).$$

The main term in this expression is just the dispersion for the
Gaussian distribution, representing, approximately, the energy
distribution of the large component (comp. Chapter V, section
22).
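
For a model in which the exact dispersion is available in closed form, the main term $B'(B - B')/B$ can be compared with it directly. The sketch below assumes (our assumption, not the text's) an ideal gas whose components have generating functions $\alpha^{-a_1}$ and $\alpha^{-a_2}$; then $E_1/E$ follows a Beta law with parameters $a_1$, $a_2$, and the dispersions under the canonical law are $B = a/\vartheta^2$, $B' = a_1/\vartheta^2$ with $\vartheta = a/E$.

```python
def exact_dispersion(a1, a2, E):
    """Variance of E1 when E1/E follows the assumed Beta(a1, a2) law."""
    a = a1 + a2
    return E**2 * a1 * a2 / (a**2 * (a + 1.0))

def asymptotic_dispersion(a1, a2, E):
    """Main term B'(B - B')/B with B = a/theta^2, B' = a1/theta^2, theta = a/E."""
    a = a1 + a2
    theta = a / E
    B, B_prime = a / theta**2, a1 / theta**2
    return B_prime * (B - B_prime) / B

if __name__ == "__main__":
    for scale in (1, 10, 100):
        a1, a2 = 4.5 * scale, 10.5 * scale
        E = a1 + a2
        print(scale, exact_dispersion(a1, a2, E) / asymptotic_dispersion(a1, a2, E))
```

In this model the ratio of the two quantities equals $a/(a + 1)$, so the relative error of the main term decreases as the number of molecules grows.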

Let us also notice that in the most common case when all
the molecules and all the functions $f_i$ are identical ($f_i(P) = \varphi(P)$,
$1 \le i \le n$) the formula (92) gives us:

$$Df = n\,D\varphi - \frac{1}{nb}\,b\,D\varphi\,R^2(\varphi, e)(n^2 - n) + O(n^{1/2}) = n\,D\varphi\,[1 - R^2(\varphi, e)] + O(n^{1/2}), \tag{93}$$

where $e$ stands for the energy of a single molecule.
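
Formula (93) can be compared with an exact computation in a solvable model. The sketch below is illustrative and rests on assumptions of ours: each molecule has structure function $\omega(x) \propto x^{a-1}$ with $a = 3/2$, so that the molecular energies at fixed total energy follow a scaled Dirichlet law, and the sum function is $f = \sum e_i^2$ (i.e. $\varphi = e^2$). The exact dispersion uses the known mixed moments of the Dirichlet law; the right hand side of (93) uses single-molecule moments of the corresponding gamma law.

```python
def exact_dispersion(n, a=1.5, E=1.0):
    """Exact variance of sum(e_i**2) when (e_1..e_n)/E is Dirichlet(a, ..., a)."""
    A = n * a
    P = A * (A + 1) * (A + 2) * (A + 3)
    m2 = a * (a + 1) / (A * (A + 1))             # mean of x_i^2
    m4 = a * (a + 1) * (a + 2) * (a + 3) / P     # mean of x_i^4
    m22 = (a * (a + 1)) ** 2 / P                 # mean of x_i^2 * x_k^2, i != k
    return E**4 * (n * (m4 - m2**2) + n * (n - 1) * (m22 - m2**2))

def formula_93(n, a=1.5, E=1.0):
    """n * D(phi) * [1 - R^2(phi, e)] for phi = e^2, moments from a gamma law."""
    s = E / (n * a)                              # scale 1/theta, mean energy E/n
    m1 = a * s
    m2 = a * (a + 1) * s**2
    m3 = a * (a + 1) * (a + 2) * s**3
    m4 = a * (a + 1) * (a + 2) * (a + 3) * s**4
    d_phi = m4 - m2**2                           # dispersion of phi
    d_e = m2 - m1**2                             # dispersion of the energy
    cov = m3 - m2 * m1                           # covariance of phi and e
    return n * d_phi * (1.0 - cov**2 / (d_phi * d_e))

if __name__ == "__main__":
    for n in (50, 200, 1000, 5000):
        print(n, exact_dispersion(n) / formula_93(n))
```

The ratio approaches unity as $n$ grows; for small $n$ the correction is noticeable because $1 - R^2(\varphi, e)$ is itself small for this choice of $\varphi$.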

Let us turn now to a question concerning the distributions
of sum functions. As usual, we will consider this as a limit
problem, studying the form of the laws for $n \to \infty$.

We can expect without detailed calculation that we will
encounter here the same Gaussian distribution as in the particular
case of the energy distribution of a large component of
a given system (comp. section 22, chapter V). In fact,
any sum function represents the sum of an infinitely large
number of random quantities; the interrelation of these quantities
is determined by the condition that the sum of the energies
of individual molecules is equal to the total energy $E$ of the
system.

With an ever increasing number of molecules, the correlation
between the dynamical coordinates of any two of them becomes
very weak; we have seen, in fact, that the correlation
coefficient of two molecules tends to zero when $n \to \infty$. Hence
using a well known theorem of the theory of probability one
can expect that the distribution of the sum functions for a
large number of molecules will be, as a rule, similar to the
Gaussian distribution.

Let us consider the sum function $F = \sum_{i=1}^{n} f_i(Q_i)$ where
$Q_i$ represents the set of the dynamic coordinates of that molecule
associated with the function $f_i$. Let $V_i(x, y)$ be the volume
of that part of the phase space for this molecule, where $e_i < x$,
$f_i(Q_i) < y$. Also put

$$\omega_i(x, y) = \frac{\partial^2V_i(x, y)}{\partial x\,\partial y}.$$

We will denote by $V(x, y)$ and $\Omega(x, y)$ the functions which are
determined in similar manner for the entire system (the functions
$f_i$ will, of course, be replaced by $F$). If we divide this
system into two components (characterized by the indices 1
and 2) one can easily see that

$$V(a, b) = \int_{E<a,\;F<b} dV = \int dV_1\int_{E_2<a-E_1,\;F_2<b-F_1} dV_2 = \int V_2(a - E_1,\, b - F_1)\,dV_1 = \iint V_2(a - x,\, b - y)\,\Omega_1(x, y)\,dx\,dy$$

where the double integral is extended over the entire plane
$(x, y)$.

This gives us

$$\Omega(a, b) = \frac{\partial^2V(a, b)}{\partial a\,\partial b} = \iint\Omega_1(x, y)\,\Omega_2(a - x,\, b - y)\,dx\,dy. \tag{94}$$


This formula, which is analogous to the law of composition
for structure functions (20), can be easily extended to an
arbitrary number of components.

Let us now try to express the distribution law of $F$ in terms
of the function $\Omega(x, y)$. It can be done most simply by referring
to the original geometrical meaning of the probability.
We have previously defined the probability of the relation

$$b < F < b + \Delta b. \tag{95}$$

We are now interested in this as the area of that part of the
energy surface where this relation is fulfilled approaches zero.
For this purpose we have selected the particular surface
metrics for which the measure of any region of the surface
$\Sigma_a$ is equal to the limit (for $\Delta a \to 0$) of a ratio in which the
numerator is the volume measure of the layer $a < E < a + \Delta a$
located above the given surface, and the denominator is equal to
$\Delta a$. In particular, the measure $\Omega(a)$ of the entire energy surface
is equal to $V'(a)$.

It follows that the probability of the relation (95) can be
determined geometrically as the limit (for $\Delta a \to 0$) of the
ratio of the volume of phase space for which $a < E < a + \Delta a$
and $b < F < b + \Delta b$, to the volume of the layer $a < E < a + \Delta a$.

But the first of these two volumes is given by

$$V(a + \Delta a,\, b + \Delta b) - V(a + \Delta a,\, b) - V(a,\, b + \Delta b) + V(a, b),$$

whereas the second is

$$V(a + \Delta a) - V(a).$$

Dividing the numerator and denominator of this ratio by
$\Delta a$, we find that the probability in question is

$$\lim_{\Delta a \to 0}\frac{[V(a + \Delta a,\, b + \Delta b) - V(a + \Delta a,\, b) - V(a,\, b + \Delta b) + V(a, b)]/\Delta a}{[V(a + \Delta a) - V(a)]/\Delta a}.$$

The probability density $p(a, b)$ is now obtained by dividing
the above expression by $\Delta b$ and putting $\Delta b \to 0$. This yields

$$p(a, b) = \frac{\partial^2V(a, b)/\partial a\,\partial b}{dV(a)/da} = \frac{\Omega(a, b)}{\Omega(a)}.$$

Starting from this basic formula, we will now obtain from it
asymptotic formulae which are more convenient in further
calculations. We will do this by using the composition law (94)
for the function $\Omega(x, y)$ in the same manner as for the structure
functions in chapter V.

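Let us assume that our system consists of a large number $n$
of molecules, and let us put, as usual

$$\varphi_i(\vartheta) = \int\omega_i(x)\,e^{-\vartheta x}\,dx\qquad (1 \le i \le n),$$

$$\Phi(\vartheta) = \prod_{i=1}^{n}\varphi_i(\vartheta) = \int\Omega(x)\,e^{-\vartheta x}\,dx$$

where $\vartheta$ is determined by the relation $(d\log\Phi)/d\vartheta + E = 0$.
Let us also denote

$$\frac{\omega_i(x, y)\,e^{-\vartheta x}}{\varphi_i(\vartheta)} = u_i(x, y)\quad (1 \le i \le n),\qquad \frac{\Omega(x, y)\,e^{-\vartheta x}}{\Phi(\vartheta)} = U(x, y). \tag{96}$$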


Since

$$\int\omega_i(x, y)\,dy = \omega_i(x)\quad (1 \le i \le n),\qquad \int\Omega(x, y)\,dy = \Omega(x),^{*} \tag{97}$$

it follows that the functions $u_i(x, y)$ and $U(x, y)$ represent the
probability densities of some two-dimensional distribution (the
analogue of the conjugate distributions).

Generalizing the composition law (94) for $n$ components we
obtain

$$\Omega(a, b) = \int\Bigl\{\prod_{i=1}^{n-1}\omega_i(x_i, y_i)\,dx_i\,dy_i\Bigr\}\,\omega_n\Bigl(a - \sum_{i=1}^{n-1}x_i,\; b - \sum_{i=1}^{n-1}y_i\Bigr).$$

Expressing the functions $\Omega$ and $\omega_i$ through the functions $U$
and $u_i$, we can easily find from the formulae (96) that:

$$U(a, b) = \int\Bigl\{\prod_{i=1}^{n-1}u_i(x_i, y_i)\,dx_i\,dy_i\Bigr\}\,u_n\Bigl(a - \sum_{i=1}^{n-1}x_i,\; b - \sum_{i=1}^{n-1}y_i\Bigr).$$

This relation shows that $U(x, y)$ is the probability density
of the two-dimensional distribution for the sum of $n$ mutually
independent terms distributed according to the densities
$u_i(x, y)$ ($1 \le i \le n$). Therefore, making certain assumptions
concerning the limiting behavior of the functions $\omega_i$ and $f_i$,
we can use the two-dimensional central limit theorem.**

*This last relation can be obtained, for example, by differentiating the
relation

$$V(x) = V(x, +\infty) - V(x, -\infty) = \int_{-\infty}^{+\infty}\frac{\partial V(x, y)}{\partial y}\,dy.$$

**Although the proof of this theorem for this particular case has never
been published, we do not think it expedient to burden the present exposition
by its detailed presentation. Although this proof is very long, it does
not present any fundamental difficulties, and the reader can develop it
along the lines given in the appendix.



For large $n$,

$$U(a, b) \approx \frac{1}{2\pi\Delta^{1/2}}\exp\Bigl\{-\frac{1}{2\Delta}\bigl[B_2(a - A_1)^2 + A_2(b - B_1)^2 - 2C_1(a - A_1)(b - B_1)\bigr]\Bigr\}$$

where*

$$A_1 = \sum_{i=1}^{n}a_i,\qquad a_i = \iint x\,u_i(x, y)\,dx\,dy\quad (1 \le i \le n),$$

$$B_1 = \sum_{i=1}^{n}b_i,\qquad b_i = \iint y\,u_i(x, y)\,dx\,dy\quad (1 \le i \le n),$$

$$A_2 = \sum_{i=1}^{n}\iint(x - a_i)^2\,u_i(x, y)\,dx\,dy,$$

$$B_2 = \sum_{i=1}^{n}\iint(y - b_i)^2\,u_i(x, y)\,dx\,dy,$$

$$C_1 = \sum_{i=1}^{n}\iint(x - a_i)(y - b_i)\,u_i(x, y)\,dx\,dy,$$

$$\Delta = A_2B_2 - C_1^2.$$

*To avoid any misunderstanding one should note that the notations
used in this derivation do not correspond to those which we used in the
one-dimensional case. Previously we used the letter $A$ to symbolize moments
of the first order and the letters $B$ and $C$ (with different indices) to
symbolize the moments of the second order; now we use these indices to
indicate the order of the moments.

Because of the relations (96), (97):

$$A_1 = \sum_{i=1}^{n}a_i = -\sum_{i=1}^{n}\frac{\varphi_i'(\vartheta)}{\varphi_i(\vartheta)} = -\frac{d\log\Phi(\vartheta)}{d\vartheta} = E;$$



164 


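therefore, for a = E,

$$U(E, b) \approx \frac{1}{2\pi \Delta^{1/2}} \exp\left[ -\frac{A_2}{2\Delta} (b - B_1)^2 \right]$$

and, because of (96),

$$\Omega(E, b) \approx \Phi(\vartheta)\, e^{\vartheta E}\, \frac{1}{2\pi \Delta^{1/2}} \exp\left[ -\frac{A_2}{2\Delta} (b - B_1)^2 \right].$$

Remembering that according to (42)

$$\Omega(E) \approx \frac{\Phi(\vartheta)\, e^{\vartheta E}}{(2\pi A_2)^{1/2}},$$

we finally find

$$P(E, b) = \frac{\Omega(E, b)}{\Omega(E)} \approx \frac{1}{[2\pi (\Delta / A_2)]^{1/2}} \exp\left[ -\frac{A_2}{2\Delta} (b - B_1)^2 \right].$$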

In other words the limiting form is nothing but a Gaussian
distribution with the center at $B_1$ and with the dispersion:

$$\frac{\Delta}{A_2} = B_2 - \frac{C_2^2}{A_2} .$$



It remains to clarify the meaning of these parameters, and to
see whether their values coincide with those which we obtained
before. For this purpose we will show, first of all, that $u_i(x, y)$
is the density of the two-dimensional probability distribution
which governs (approximately) the pair of random quantities
$e_i$, $f_i$. In fact, since the set of dynamic coordinates of the
selected molecule is distributed in the phase space $\gamma_i$ of this
molecule according to the law characterized (approximately)
by the density $e^{-\vartheta e_i}/\varphi_i(\vartheta)$, the probability that the
inequalities $x < e_i < x + dx$ and $y < f_i < y + dy$ will be
satisfied simultaneously is equal to the integral of the above
expression, extended over that part of the phase space $\gamma_i$
where the above inequalities are satisfied. Since, however, the
integrated function can be considered as a constant (equal
to $e^{-\vartheta x}/\varphi_i(\vartheta)$) in that part of the space, and since the volume
of this region is given by
$[\partial^2 V_i(x, y)/\partial x\, \partial y]\, dx\, dy = v_i(x, y)\, dx\, dy$,
this probability becomes
$(e^{-\vartheta x}/\varphi_i(\vartheta))\, v_i(x, y)\, dx\, dy = u_i(x, y)\, dx\, dy$.
This proves our statement. From this it follows that


$$B_1 = \sum_{i=1}^{n} \bar{f}_i, \qquad A_2 = \sum_{i=1}^{n} De_i, \qquad B_2 = \sum_{i=1}^{n} Df_i, \qquad C_2 = \sum_{i=1}^{n} (De_i\, Df_i)^{1/2} R(e_i, f_i).$$


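Therefore, for the dispersion associated with the limiting form
we get

(98) $$B_2 - \frac{C_2^2}{A_2} = \sum_{i=1}^{n} Df_i - \frac{\left[ \sum_{i=1}^{n} (De_i\, Df_i)^{1/2} R(e_i, f_i) \right]^2}{\sum_{i=1}^{n} De_i} .$$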

This expression coincides with the formula (92) (within the
limits specified in its derivation). The expression (98) also leads
to the formula (93) if one assumes that all the functions $f_i$
are identical, and all molecules have the same structure.

Let us also note that in the case when the functions $f_i$ are
not correlated with the energies $e_i$ of the corresponding
molecules (i.e. when all $R(e_i, f_i) = 0$) the expression (98) leads to
the value $B_2$ for the dispersion of the function F.

This fact could have been foreseen; in fact, we have already seen
that the stochastic dependence between the dynamic coordinates
of different molecules is due entirely to the condition
$\sum e_i = E$. It is clear, therefore, that the functions of the
dynamic coordinates of individual molecules, not being correlated
with the energies and therefore not being subject to the
above condition, must behave as non-correlated random
quantities. Thus the dispersion of their sum must be equal to
the sum of their dispersions.
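
The dependence of the limiting dispersion on the correlation coefficients can be illustrated with a short computation. The sketch below is only an illustration under stated assumptions: the function name `limiting_dispersion` and the numerical values of the dispersions and correlation coefficients are hypothetical, chosen for the example, and the formula coded is the expression "sum of Df_i minus the squared sum of (De_i Df_i)^(1/2) R(e_i, f_i), divided by the sum of De_i" discussed above.

```python
import math

def limiting_dispersion(de, df, r):
    """Dispersion of the limiting Gaussian law.

    de, df : lists of the dispersions De_i and Df_i (hypothetical inputs)
    r      : list of the correlation coefficients R(e_i, f_i)
    """
    a2 = sum(de)  # sum of the De_i
    b2 = sum(df)  # sum of the Df_i
    # sum of the covariances (De_i * Df_i)**(1/2) * R(e_i, f_i)
    c2 = sum(math.sqrt(x * y) * rho for x, y, rho in zip(de, df, r))
    return b2 - c2 * c2 / a2

# Illustrative data: five identical molecules (assumed values).
n = 5
de = [2.0] * n
df = [3.0] * n

# No correlation with the energies: the dispersions simply add up.
uncorrelated = limiting_dispersion(de, df, [0.0] * n)

# Correlation with the energies reduces the dispersion of the sum.
correlated = limiting_dispersion(de, df, [0.5] * n)

print(uncorrelated, correlated)
```

For identical molecules the expression reduces algebraically to n Df (1 - R^2), so any nonzero correlation with the energies makes the dispersion smaller than the plain sum of the individual dispersions.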



APPENDIX 


A proof of the central limit theorem of the theory of
probability. We find it necessary here to give a complete proof
of the central limit theorem of the theory of probability,
because that form of this proof which is most convenient for
the purposes of statistical mechanics is somewhat different
from the form usually encountered in mathematical texts. The
point is that in mathematics one naturally tends to formulate
theorems in the most general way, thereby sacrificing
considerations of the accuracy of the given estimates; in the case
of the central limit theorem one tries to give a proof which
would hold for the broadest possible class of initial distribution
functions, without giving much attention to the smallness
of the higher order terms. On the other hand, in the case of
statistical mechanics we can limit ourselves to comparatively
"smooth" distributions, paying more attention to a detailed
estimate of the secondary terms.

This difference in the points of view results in a somewhat
different treatment of the details of the proof, and prevents us
from simply referring the reader to the standard mathematical
texts. It may be noticed, however, that the general idea and
the analytical method used in the proof remain essentially
unchanged, so that the competent reader could actually do this
himself.

The central limit theorem. Suppose we have a sequence of
mutually independent random quantities which are governed
by distribution functions with the probability densities $u_k(x)$
(k = 1, 2, ...), and let

$$g_k(t) = \int e^{itx} u_k(x)\, dx \quad (k = 1, 2, \ldots)$$

represent the characteristic functions corresponding to these
distribution laws. Let us assume that:





1. The functions $u_k(x)$ possess continuous derivatives, and
there exists such a constant A that

$$\int |u_k'(x)|\, dx < A \quad (k = 1, 2, \ldots).$$

2. The functions $u_k(x)$ possess finite moments of the first five
orders which we will denote by $a_k$, $b_k$, $c_k$, $d_k$, $e_k$. Without
restricting the generality of our proof we can put $a_k = 0$ (k = 1,
2, ...); there exist positive constants $\alpha$ and $\beta$ such that:

$$0 < \alpha < b_k < \beta, \quad C_k < \beta, \quad d_k < \beta, \quad E_k < \beta \quad (k = 1, 2, \ldots)$$

where $C_k$ and $E_k$ represent the absolute moments of the third and
fifth order of the functions $u_k(x)$.

3. There exist positive constants a and b such that for $|t| < a$:

$$|g_k(t)| > b \quad (k = 1, 2, \ldots).$$

4. For each interval $(c_1, c_2)$ $(c_1 c_2 > 0)$ there exists a number
$\rho(c_1, c_2) < 1$ such that for any t within the interval $(c_1, c_2)$ we
have:

$$|g_k(t)| < \rho \quad (k = 1, 2, \ldots).$$


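Let $U_n(x)$ be the probability density of the sum of the first n
terms in the given sequence of random quantities. Then, for
$n \to \infty$ and $|x| < 2 \log^2 n$, we have

$$U_n(x) = \frac{1}{(2\pi B_n)^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right] + \frac{S_n + T_n x}{B_n^{5/2}} + O\left( \frac{1 + |x|^3}{n^2} \right).$$

For any arbitrary x we have:

$$U_n(x) = \frac{1}{(2\pi B_n)^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right] + O\left( \frac{1}{n} \right),$$

where $S_n$ and $T_n$ are quantities independent
of x, increasing not faster than n.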
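
A small numerical experiment may help to make the two error estimates of the theorem concrete. The sketch below is only an illustration under stated assumptions: the terms are taken to be identically distributed with the centered Gamma(3, 1) density (a choice made here because the density of the sum is then known in closed form and because this density appears to satisfy the four postulates), and the function names are hypothetical. For this choice $b_k = 3$, so that $B_n = 3n$.

```python
import math

def exact_density(n, x):
    """Density at x of the sum of n independent centered Gamma(3, 1) terms.

    The uncentered sum follows the Gamma(3n, 1) law with mean 3n.
    """
    k = 3 * n
    y = x + k
    if y <= 0.0:
        return 0.0
    # Work with logarithms to avoid overflow for large n.
    return math.exp((k - 1) * math.log(y) - y - math.lgamma(k))

def gaussian_term(n, x):
    """Leading Gaussian term with B_n = 3n."""
    b_n = 3.0 * n
    return math.exp(-x * x / (2.0 * b_n)) / math.sqrt(2.0 * math.pi * b_n)

def max_error(n, points=2001):
    """Largest deviation from the Gaussian term on a grid of six standard deviations."""
    sigma = math.sqrt(3.0 * n)
    worst = 0.0
    for j in range(points):
        x = -6.0 * sigma + 12.0 * sigma * j / (points - 1)
        worst = max(worst, abs(exact_density(n, x) - gaussian_term(n, x)))
    return worst

for n in (25, 100, 400):
    err = max_error(n)
    # n * err should stay roughly constant if the uniform error is of order 1/n.
    print(n, err, n * err)
```

In this experiment the product of n and the maximal error should remain of the same order of magnitude as n grows, in agreement with the estimate valid for arbitrary x, while the error at x = 0 decreases faster, as the first formula suggests.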




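The proof. We start with the Maclaurin formula

$$g_k(t) = 1 - \frac{b_k}{2} t^2 - \frac{i c_k}{6} t^3 + \frac{d_k}{24} t^4 + \theta_k \frac{|t|^5}{120} \max_{\tau} |g_k^{(5)}(\tau)|, \qquad |\theta_k| \le 1,$$

the validity of which is due in this case to the fact that the
existence of an absolute moment of a certain order implies the
existence of the corresponding derivative of the characteristic
function. Because of the postulate 2:

$$|g_k^{(5)}(t)| \le \int |x|^5 u_k(x)\, dx = E_k < \beta \quad (k = 1, 2, \ldots).$$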

Denoting by $\gamma_k(t) = \log g_k(t)$ that branch of the logarithmic
function which passes through zero for t = 0, we can easily
prove that for $t \to 0$:

(100) $$\gamma_k(t) = -\frac{b_k}{2} t^2 - \frac{i c_k}{6} t^3 + \frac{1}{24} (d_k - 3 b_k^2)\, t^4 + O(t^5)$$

which holds uniformly with respect to k. Let us put:


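$$\sum_{k=1}^{n} b_k = B_n, \qquad \sum_{k=1}^{n} c_k = C_n,$$

and substitute in the previous formula $t = u/(B_n)^{1/2}$. We will
always assume in future that $u = O(\log n)$. Taking the sum
of (100) for k from 1 to n, we find:

$$\sum_{k=1}^{n} \gamma_k\left( \frac{u}{(B_n)^{1/2}} \right) = -\frac{u^2}{2B_n} \sum_{k=1}^{n} b_k - \frac{i u^3}{6 B_n^{3/2}} \sum_{k=1}^{n} c_k + \frac{u^4}{24 B_n^2} \sum_{k=1}^{n} (d_k - 3 b_k^2) + O\left( \frac{n |u|^5}{B_n^{5/2}} \right)$$

which can be reduced to

$$\sum_{k=1}^{n} \gamma_k\left( \frac{u}{(B_n)^{1/2}} \right) = -\frac{u^2}{2} - \frac{i C_n}{6 B_n^{3/2}} u^3 + \frac{u^4}{24 B_n^2} \sum_{k=1}^{n} (d_k - 3 b_k^2) + O\left( \frac{|u|^5}{n^{3/2}} \right).$$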






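We rewrite this relation as:

(101) $$\prod_{k=1}^{n} g_k\left( \frac{u}{(B_n)^{1/2}} \right) = e^{-u^2/2} \left\{ 1 + i K_n B_n^{-3/2} u^3 + L_n B_n^{-2} u^4 - M_n^2 B_n^{-3} u^6 + O\left( \frac{1 + |u|^9}{n^{3/2}} \right) \right\}$$

where (as always in the future) each capital letter with the
index n denotes the sum of n real numbers, independent of u,
which form bounded sets for $n \to \infty$.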
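
The expansion (100) of the logarithm of a characteristic function, on which the relation (101) rests, can be checked numerically for a concrete distribution. The sketch below is only an illustration under stated assumptions: it uses the centered Gamma(3, 1) law (chosen here for convenience, not taken from the text), whose characteristic function is $e^{-3it}(1 - it)^{-3}$ and whose second, third and fourth moments about the mean are 3, 6 and 45. The function names are hypothetical.

```python
import cmath

# Moments of the centered Gamma(3, 1) law (assumed example distribution).
B = 3.0   # second moment b_k
C = 6.0   # third moment c_k
D = 45.0  # fourth moment d_k

def log_char(t):
    """Branch of log g(t) vanishing at t = 0, with g(t) = exp(-3it) / (1 - it)**3."""
    return -3j * t - 3.0 * cmath.log(1.0 - 1j * t)

def expansion(t):
    """Right hand side of (100) without the O(t**5) term."""
    return (-B / 2.0 * t**2
            - 1j * C / 6.0 * t**3
            + (D - 3.0 * B * B) / 24.0 * t**4)

for t in (0.2, 0.1, 0.05):
    remainder = abs(log_char(t) - expansion(t))
    # The ratio remainder / t**5 should stay bounded as t decreases.
    print(t, remainder, remainder / t**5)
```

The printed ratio stays bounded as t decreases, which is what the term $O(t^5)$ in (100) asserts for this particular distribution.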

As is well known from the general theory of characteristic
functions

$$U_n(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-itx} \left\{ \prod_{k=1}^{n} g_k(t) \right\} dt.$$

We will divide the interval of integration into two parts,
defined by $|t| < \log n/(B_n)^{1/2}$ and $|t| > \log n/(B_n)^{1/2}$, and
will write $I_1$ and $I_2$ for the values of the corresponding partial
integrals. In order to estimate the value of $I_2$ we will use a
shorter form of the Maclaurin formula

$$g_k(t) = 1 - \frac{t^2}{2} b_k + \theta_k \frac{|t|^3}{6} \max_{\tau} |g_k'''(\tau)|, \qquad |\theta_k| \le 1.$$

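Since

$$|g_k'''(t)| \le \int |x|^3 u_k(x)\, dx = C_k < \beta$$

we obtain

$$|g_k(t)| < \left| 1 - \frac{t^2}{2} b_k \right| + \frac{\beta}{6} |t|^3 .$$

Remembering that $1 - z \le e^{-z}$ for any real z, and assuming
$t^2 < 2/\beta < 2/b_k$ (k = 1, 2, ...), i.e. that $1 - t^2 b_k/2 > 0$
(k = 1, 2, ...), we have:

$$|g_k(t)| < \exp\left[ -\frac{t^2}{2} b_k + \frac{\beta}{6} |t|^3 \right] \quad (k = 1, 2, \ldots).$$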




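Assuming further that $|t|$ is so small that the second term in the
exponent is smaller than half of the first term, we have:

$$|g_k(t)| < \exp\left[ -\frac{t^2}{4} b_k \right].$$

Since the quantities $b_k$ are bounded from below (postulate 2),
we conclude that, for sufficiently small t, the above inequality
will hold for any k; thus,

$$\left| \prod_{k=1}^{n} g_k(t) \right| < \exp\left( -\frac{t^2}{4} B_n \right).$$

Let this relation hold for $|t| < \delta$; then, for sufficiently large n,

$$\frac{\log n}{(B_n)^{1/2}} < \delta$$

and

(102) $$\left| \int_{\log n/(B_n)^{1/2}}^{\delta} e^{-itx} \left\{ \prod_{k=1}^{n} g_k(t) \right\} dt \right| < \int_{\log n/(B_n)^{1/2}}^{\infty} e^{-t^2 B_n/4}\, dt = \frac{1}{(B_n)^{1/2}} \int_{\log n}^{\infty} e^{-u^2/4}\, du = O\left( \exp\left[ -\tfrac{1}{4} \log^2 n \right] \right) = O(n^{-2}).$$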

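For an estimate of other parts of the integral let us notice that
because of the postulate 1

(103) $$|g_k(t)| = \left| \int e^{itx} u_k(x)\, dx \right| = \frac{1}{|t|} \left| \int e^{itx} u_k'(x)\, dx \right| < \frac{A}{|t|}$$

since $u_k(+\infty) = u_k(-\infty) = 0$.*

*The existence of the limits $u_k(+\infty)$ and $u_k(-\infty)$ follows from the
integrability of the function $|u_k'(x)|$. The integrability of the function $u_k(x)$
itself leads us to the conclusion that in the limit the two above values must
vanish.

Because of the postulate 4
there must exist a number $\rho$, $0 < \rho < 1$, such that for
$\delta \le t \le 2A$

$$|g_k(t)| < \rho < 1 \quad (k = 1, 2, \ldots)$$

so that

$$\left| \int_{\delta}^{2A} e^{-itx} \left\{ \prod_{k=1}^{n} g_k(t) \right\} dt \right| < \rho^n (2A - \delta) = O(n^{-2})$$

whereas, because of (103) we have

$$\left| \int_{2A}^{\infty} e^{-itx} \left\{ \prod_{k=1}^{n} g_k(t) \right\} dt \right| < \int_{2A}^{\infty} \left( \frac{A}{t} \right)^n dt = O(n^{-2}).$$

Since similar estimates can be obtained for corresponding parts
of the region t < 0, the relation (102) in conjunction with the
last two relations gives us

$$I_2 = O(n^{-2})$$

which holds uniformly with respect to x.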
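
The two inequalities used in the estimate of $I_2$, namely the Gaussian bound $|g_k(t)| < \exp[-t^2 b_k/4]$ for small t and the bound of the type (103) for large t, can be checked on a concrete example. The sketch below is only an illustration under stated assumptions: it uses the centered Gamma(3, 1) density (an example chosen here, not taken from the text), for which $|g(t)| = (1 + t^2)^{-3/2}$, $b = 3$, and the integral of $|u'(x)|$ equals twice the maximum of the density, that is $4e^{-2}$. All names are hypothetical.

```python
import math

B_K = 3.0                                # second moment of the example density
TOTAL_VARIATION = 4.0 * math.exp(-2.0)   # integral of |u'(x)| = 2 * max u(x)

def abs_char(t):
    """Modulus of the characteristic function of the centered Gamma(3, 1) law."""
    return (1.0 + t * t) ** -1.5

def gaussian_bound_holds(t):
    """Check |g(t)| <= exp(-b t**2 / 4), the bound used near t = 0."""
    return abs_char(t) <= math.exp(-B_K * t * t / 4.0)

def decay_bound_holds(t):
    """Check |g(t)| <= (integral of |u'|) / |t|, the bound behind (103)."""
    return abs_char(t) <= TOTAL_VARIATION / abs(t)

small_t = [j / 100.0 for j in range(1, 101)]   # 0 < t <= 1
large_t = [j / 10.0 for j in range(1, 501)]    # 0 < t <= 50

print(all(gaussian_bound_holds(t) for t in small_t))
print(all(decay_bound_holds(t) for t in large_t))
```

For this example the Gaussian bound holds on the whole interval 0 < t <= 1, so a value such as delta = 1 would serve in the argument above; for other densities the admissible delta has to be determined separately.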

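Let us turn now to an asymptotic estimate of the integral

$$I_1 = \frac{1}{2\pi} \int_{|t| < \log n/(B_n)^{1/2}} e^{-itx} \left\{ \prod_{k=1}^{n} g_k(t) \right\} dt = \frac{1}{2\pi (B_n)^{1/2}} \int_{-\log n}^{\log n} \exp\left[ -\frac{iux}{(B_n)^{1/2}} \right] \left\{ \prod_{k=1}^{n} g_k\left( \frac{u}{(B_n)^{1/2}} \right) \right\} du.$$

Replacing the product under the integral by its asymptotic
expression (101), we obtain

$$I_1 = \frac{1}{2\pi (B_n)^{1/2}} \int_{-\log n}^{\log n} \exp\left[ -\frac{iux}{(B_n)^{1/2}} - \frac{u^2}{2} \right] \left\{ 1 + i K_n B_n^{-3/2} u^3 + L_n B_n^{-2} u^4 - M_n^2 B_n^{-3} u^6 + O\left( \frac{1 + |u|^9}{n^{3/2}} \right) \right\} du.$$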





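We will denote the first four terms on the right hand side by
$A_1$, $A_2$, $A_3$, $A_4$. Since $e^{-u^2/2}$ is known to be the characteristic
function of the Gaussian distribution with center at zero and
with dispersion 1, we have

$$\frac{1}{2\pi} \int_{-\infty}^{+\infty} \exp\left[ -\frac{iux}{(B_n)^{1/2}} - \frac{u^2}{2} \right] du = \frac{1}{(2\pi)^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right]$$

and since

$$\left| \int_{|u| > \log n} \exp\left[ -\frac{iux}{(B_n)^{1/2}} - \frac{u^2}{2} \right] du \right| \le \int_{|u| > \log n} e^{-u^2/2}\, du = O(n^{-2})$$

we obtain finally

(104) $$A_1 = \frac{1}{(2\pi B_n)^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right] + O(n^{-2}).$$

For an estimate of the remaining three integrals we will
introduce the notation

$$\int_{-\infty}^{+\infty} u^r e^{-u^2/2}\, du = m_r \quad (r = 1, 2, \ldots).$$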

Let us notice that, for sufficiently large u,

$$|u|^r < e^{u^2/4}$$

so that for sufficiently large n

(105) $$\left| \int_{|u| > \log n} u^r e^{-u^2/2}\, du \right| < \int_{|u| > \log n} e^{-u^2/4}\, du < \exp\left[ -\tfrac{1}{4} \log^2 n \right] = O(n^{-2}).$$

We will also assume that $|x| < 2 \log^2 n$.
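
The quantities $m_r$ and the tail estimate (105) lend themselves to a direct numerical check. The sketch below is only an illustration under stated assumptions: the function names, the integration width and the step count are hypothetical choices made for this example, and the closed form used for $m_r$ is the standard value of the Gaussian moment integral, $2^{(r+1)/2}\,\Gamma((r+1)/2)$ for even r and zero for odd r.

```python
import math

def gaussian_moment(r):
    """Closed form of the integral of u**r * exp(-u**2 / 2) over the real line."""
    if r % 2 == 1:
        return 0.0  # odd integrand
    return 2.0 ** ((r + 1) / 2.0) * math.gamma((r + 1) / 2.0)

def tail_integral(r, cutoff, width=40.0, steps=200000):
    """Midpoint-rule value of the integral of |u|**r * exp(-u**2 / 2) over |u| > cutoff.

    The integrand is negligible beyond cutoff + width, so the range is truncated there.
    """
    h = width / steps
    total = 0.0
    for j in range(steps):
        u = cutoff + (j + 0.5) * h
        total += u ** r * math.exp(-u * u / 2.0)
    return 2.0 * total * h  # both tails, by symmetry

for n in (1000, 10**6):
    cutoff = math.log(n)
    bound = math.exp(-cutoff * cutoff / 4.0)
    # Compare the tails for r = 4 and r = 6 with the bound exp[-(1/4) log^2 n].
    print(n, tail_integral(4, cutoff), tail_integral(6, cutoff), bound)
```

With the cutoff set to zero the same routine reproduces the full moments, which gives a simple consistency check of the numerical integration against the closed form.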
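For an estimate of $A_2$ we notice that

$$\int_{-\log n}^{\log n} u^3 \exp\left[ -\frac{iux}{(B_n)^{1/2}} \right] e^{-u^2/2}\, du = -i \int_{-\log n}^{\log n} u^3 \sin\frac{ux}{(B_n)^{1/2}}\, e^{-u^2/2}\, du = -\frac{ix}{(B_n)^{1/2}} \int_{-\log n}^{\log n} u^4 e^{-u^2/2}\, du + O(|x|^3 n^{-3/2}).$$

Using the estimate (105), and substituting into $A_2$, we find

(106) $$A_2 = \frac{K_n m_4\, x}{2\pi B_n^{5/2}} + O\left( \frac{1 + |x|^3}{n^2} \right).$$

In similar manner the estimate of

$$\int_{-\log n}^{\log n} u^4 \cos\frac{ux}{(B_n)^{1/2}}\, e^{-u^2/2}\, du = m_4 + O\left( \frac{x^2}{n} \right)$$

gives us

(107) $$A_3 = \frac{L_n m_4}{2\pi B_n^{5/2}} + O\left( \frac{1 + |x|^3}{n^2} \right).$$

We also get

(108) $$A_4 = -\frac{M_n^2 m_6}{2\pi B_n^{7/2}} + O\left( \frac{1 + |x|^3}{n^2} \right).$$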

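Collecting the estimates (104), (106), (107), (108), and
remembering that $I_2 = O(n^{-2})$, we find

(109) $$U_n(x) = \frac{1}{(2\pi B_n)^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right] + \frac{S_n + T_n x}{B_n^{5/2}} + O\left( \frac{1 + |x|^3}{n^2} \right)$$

which proves the first part of our theorem. To prove the
second part of the theorem it is sufficient to notice that the
integrals in $A_2$, $A_3$ and $A_4$ remain bounded uniformly with
respect to x $(-\infty < x < +\infty)$ when $n \to \infty$, and that the
estimates of $I_2$ and $A_1$ are also uniform with respect to x.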

Remark. For many applications of the central limit theorem
in the problems of statistical mechanics one often must use,
together with the formula (109), a similar formula for $U_{n'}(x)$
where n' is so close to n that the difference n' - n remains
bounded for $n \to \infty$ (we often have simply n' = n - 1 or n' =
n - 2). In these cases it is useful to remember that in writing
the expression for $U_{n'}(x)$ it is not necessary to substitute n' for
n in all capital letters on the right hand side of (109). In
particular, it suffices to substitute $B_{n'}$ for $B_n$ in the radical of
the first term, leaving all other indices unchanged. Thus, for
$|x| < 2 \log^2 n$ we can write

$$U_{n'}(x) = \frac{1}{(2\pi B_{n'})^{1/2}} \exp\left[ -\frac{x^2}{2B_n} \right] + \frac{S_n + T_n x}{B_n^{5/2}} + O\left( \frac{1 + |x|^3}{n^2} \right).$$

In fact, a simple calculation, which we leave to the reader,
shows that substituting n' for all indices n, we change our
expression (because of the boundedness of $B_n - B_{n'}$, $K_n - K_{n'}$,
$L_n - L_{n'}$, $M_n - M_{n'}$) only by an infinitesimally small quantity
of an order of magnitude lower than that of the remainder
term. Thus, we can use or omit such a substitution at our
convenience.


NOTATIONS


[Table of notations not recoverable from the scan. Legible column headings: Basic system; Large component; Small component; Molecule; Complementary system. Legible row labels: The volumes; The surface; Entropy.]


INDEX 


Amount of heat, 135 
Avogadro law, 121 

Barometric formula, 127 
Birkhoff, 5, 9, 19, 54, 62 
Birkhoff theorem, 19, 27, 54 
Boltzmann, 2, 4, 52, 53 
Boltzmann’s constant, 121 
hypothesis, 52 
law, 91 
postulate, 142 
Borel, 139 
Bose, 6, 51 

Canonic distribution, 49, 111, 112, 
114 

Canonic system, 13 
Central limit theorem, 84, 85, 166
Clapeyron’s formula, 121 
Components of mechanical systems, 
38, 39 

Controllable integrals, 51 
Correlation, intermolecular, 148 

Darwin, 5 
Dirac, 6, 51 

Dynamical coordinates, 15 

Einstein, 6 
Entropy, 137, 142 
Elementary work, 132, 134 
Ehrenfest P. and T., 2 
Ergodic function, 66, 68 
hypothesis, 52, 53, 54
problem, 4, 47, 67 
External forces, 129 
parameters, 129 

Fermi, 6, 51 
Fixed integral, 51 


Fowler, 6, 11, 142 
Free integral, 51 

General dynamics, 9, 10 
Generating function, 76 
Gibbs, 3, 4, 5 

Heat capacity, 146, 147 
Hamiltonian function, 13, 15 
Hertz, P., 53 

Image point, 13 

Invariant part of phase space, 15 

Kolmogoroff, 19

Law of composition of: 
a component, 73 
conjugate function, 80 
generating function, 78 
structure function, 41 
Liouville theorem, 15

Maxwell, 2 
Maxwell law, 116 
Metric indecomposability, 28, 55 
Microcanonic distribution, 111, 112,
113

Mises, von, 6 

Natural motion in the phase space, 
14 

Neumann, von, 54 
Normal phase function, 57 
Normal subdivision, 57 

Oxtoby, 54 

Phase average, 29 
function, 15 
space, 14 







Planck, 142 

Planck's function, 145

Potential, thermodynamical, 145 

Quantum statistics, 6 
Quasiergodic hypothesis, 64 

Reduced manifold, 51 
Rotational energy of molecule, 108 

Second law of thermodynamics, 132, 
135, 136 

Structure functions, 37 


Sum functions, 63, 97 
Surface of constant energy, 32 

Temperature absolute, 121 
Theorem of equipartition of energy, 
105, 106 

Thermal equilibrium, 111 
Thermodynamical function, 133 
potential, 145 
Time average, 20 
Trajectory, 14 

Ulam, 54








) 




A CATALOGUE OF SELECTED DOVER BOOKS 
IN ALL FIELDS OF INTEREST 




A CATALOGUE OF SELECTED DOVER BOOKS 
IN ALL FIELDS OF INTEREST 


ANfERiCA's Old Masters, James T. Flexner. Four men emerged unexpectedly 
from provincial 18th century America to leadership in European art: Benjamin 
West, J. S. Copley, C. R. Peale, Gilbert Stuart. Brilliant coverage of lives and con¬ 
tributions. Revised, 1967 edition. 69 plates. 365pp. of text. 

21806-6 Paperbound $3-00 

First Flowers of Our Wilderness: American Painting, The Colonial 
Period, James T. Flexner. Painters, and regional painting traditions from earliest 
Colonial times up to the emergence of Copley, West and Peale Sr., Foster, Gustavus 
Hesselius, Feke, John Smibert and many anonymous painters in the primitive manner. 
Engaging presentation, with 162 illustrations, xxii -f 368pp. 

22180-6 Paperbound $3.50 

The Light of Distant Skies: American Painting, 1760-1835, James T. Flex¬ 
ner. The great generation of early American painters goes to Europe to learn and 
to teach: West, Copley, Gilbert Stuart and others. Allston, Trumbull. Morse; also 
contemporary American painters—primitives, derivatives, academics—who remained 
in America. 102 illustrations, xiii + 306pp. 22179-2 Paperbound $3.50 

A History of the Rise and Progress of the Arts of Design in the United 
States, William Dunlap. Much the richest mine of information on early American 
painters, sculptors, architects, engravers, miniaturists, etc. The only source of in¬ 
formation for scores of artists, the major primary source for many others. Unabridged 
reprint of rare original 1834 edition, with new introduction by James T. Flexner, 
and 394 new illustrations. Edited by Rita Weiss. 6^ x 9%. 

21695-0, 21696-9, 21697-7 Three volumes, Paperbound $15.00 

Epochs of Chinese and Japanese Art, Ernest F. Fenollosa. From primitive 
Chinese art to the 20th century, thorough history, explanation of every important art 
period and form, including Japanese woodcuts; main stress on China and Japan, but 
Tibet, Korea also included. Still unexcelled for its detailed, rich coverage of cul¬ 
tural background, aesthetic elements, diffusion studies, particularly of the historical 
period. 2nd, 1913 edition. 242 illustrations, lii -f- 439pp. of text. 

20364-6, 20365-4 Two volumes, Paperbound $6.00 

The Gentle Art of Making Enemies, James A. M. Whistler. Greatest wit of his 
day deflates Oscar Wilde, Ruskin, Swinburne; strikes back at inane critics, exhibi¬ 
tions, art journalism; aesthetics of impressionist revolution in most striking form. 
Highly readable classic by great painter. Reproduction of edition designed by 
Whistler. Introduction by Alfred Werner, xxxvi + 334pp. 

21875-9 Paperbound $3.00 



CATALOGUE OF DOVER BOOKS 


Visual Illusions: Their Causes, Characteristics, and Applications, Mat¬ 
thew Luckiesh. Thorough description and discussion of optical illusion, gc-ometric 
and perspective, particularly; size and shape distortions, illusions of color, of motion; 
natural illusions; use of illusion in art and magic, industry’, etc. Most useful today 
with op art, also for classical art. Scores of effects illustrated. Introduction by 
William H. Ittleson. 100 illustrations, xxi *!■ 252pp. 

21530*X Paperbound $2.00 

A Handbook of Anatomy for Art Students, Arthur Thomson. Thorough, vir¬ 
tually exhaustive coverage of skeletal structure, musculature, etc. Full text, supple¬ 
mented by anatomical diagrams and drawings and by photograplis of undraped 
figures. Unique in its comparison of male and female forms, pointing out differences 
of contour, texture, form. 211 figures, 40 drawings. 86 photographs, xx + 459pp. 

21163*0 Paperbound $3 50 

150 Masterpieces of Drawing, Selected by Anthony Toney. Full page reproduc¬ 
tions of drawings from the early l 6 th to the end of the 18th century, all beautifully 
reproduced: Rembrandt, Michelangelo, Diirer, Fragonard, Urs, Graf, Wouwerman, 
many others. First*rate browsing book, model book for artists, xviii 150pp. 
^Vbx 111 / 4 . 21032*4 Paperbound’$ 5.50 

The Later Work of Aubrey Beardsley, Aubrey Beardsley. Exotic, erotic, 
ironic masterpieces in full maturity: Comedy Ballet, Venus and Tannhauser, Pierrot, 
Lysistrata, Rape of the Lock, Savoy material, Ali Baba, Volponc, etc. This material 
revolutionized the art world, and is still powerful, fresh, brilliant. With The Early 
]Vork, all Beardsley’s finest work. 174 plates, 2 in color, xiv -f- 176pp. SVs x 11 . 

21817-1 Paperbound $3 75 


Drawings of Rembrandt, Rembrandt van Rijn, Complete reproduction of fabu¬ 
lously rare edition by Lippmann and Hofstede de Groot, completely rceditcd, up¬ 
dated, improved by Prof. Seymour Slive, Fogg Museum. Portraits, Biblical sketches, 
landscapes, Oriental types, nudes, episodes from classical mythology—All Rem¬ 
brandts fertile genius. Also selection of drawings by his pupils and followers. 
"Stunning volumes," Saturday Review. 550 illustrations. Ixxviii + 552pp. 
9Hxl2Vi|. 21485-0, 21486-9 Two volumes, Paperbound $10,00 

The Disasters of War, Francisco Goya. One of the masterpieces of Western civi- 
iwtion—83 etchings that record Goya's shattering, bitter reaction to the Napoleonic 
that swept through Spain after the insurrection of 1808 and to war in general, 
cprint of the first edition, with three additional plates from Boston s Museum of 
»ne Arts. All plates facsimile size. Introduction by Philip Hofer, Fogg Museum. 
v + 97pp. 9Hx8^. 21872-4 Paperbound $2.50 

Graphic Works of Odilon Redon. Largest collection of Redon’s graphic works 

172 lithographs, 28 etchings and engravings. 9 drawings. These 
include some of his most famous works. All the plates from Odilon Redon: oeuvre 

plus additional plates. New introduction and caption translations 

y Alfred Werner. 209 illustrations, xxvii + 209pp. 9Vs x 12 %- -.nn 

21966-8 Paperbound $5.00 


CATALOGUE OF DOVER BOOKS 


Design by Accident; A Book of "Accidental Effects” for Artists and 
Designers, James F. O'Brien. Create your own unique, striking, imaginative effects 
by "controlled accident" interaction of materials: paints and lacquers, oil and water 
based paints, splatter, crackling materials, shatter, similar items. Everything you do 
will be different; first book on this limitless art, so useful to both fine artist and 
commercial artist. Full instructions. 192 plates showing "accidents.” 8 in color, 
viii 4* 215pp. Q^/% x lU/i- 21942*9 Paperbound S3 '^5 

The Book of Signs, Rudolf Koch. Famed German type designer draws 493 beau¬ 
tiful symbols: religious, mystical, alchemical, imperial, property marks, runes, etc. 
Remarkable fusion of traditional and modern. Good for suggestions of timelessness, 
smartness, modernity. Text, vi + 104pp. 61/8 x 9Va- 

20162-7 Paperbound $1.25 

History of Indian and Indonesian Art, Ananda K. Coomaraswamy. An un¬ 
abridged republication of one of the finest books by a great scholar in Eastern art. 
Rich in descriptive material, history, social backgrounds; Sunga reliefs, Rajput 
paintings, Gupta temples, Burmese frescoes, textiles, jewelry, sculpture, etc. 400 
photos, viii -(- 423pp. k 9^. 21436-2 Paperbound $5.00 

Primitive Art, Franz Boas. America’s foremost anthropologist surveys textiles, 
ceramics, woodcarving, basketry, metalwork, etc.; patterns, technology, creation of 
symbols, style origins. All areas of world, but very full on Northwest Coast Indians. 
More than 350 illustrations of baskets, boxes, totem poles, weapons, etc. 378 pp. 

20025-6 Paperbound $3-00 

The Gentleman and Cabinet Maker’s Director, Thomas Chippendale. Full 
reprint (third edition. 1762) of most influential furniture book of all time, by 
master cabinetmaker. 200 plates, illustrating chairs, sofas, mirrors, tables, cabinets, 
plus 24 photographs of surviving pieces. Biographical introduction by N. Bienen- 
stock. vi 4- 249pp. 9% ^ 12%. 21601-2 Paperbound $4.00 

American Antique Furniture, Edgar G. Miller, Jr. The basic coverage of all 
American furniture before 1840. Individual chapters cover type of furniture— 
clocks, tables, sideboards, etc.—chronologically, with inexhaustible w’ealth of data. 
More than 2100 photographs, all identified, commented on. Essential to all early 
American collectors. Introduction by H. E. Keyes, vi 4- 1106pp. 7% x 10%. 

21599-7, 21600-4 Two volumes, Paperbound $11.00 

Pennsylvania Dutch American Folk Art. Henry J. Kauffman. 279 photos. 
28 drawings of tulipware, Fraktur script, painted tinware, toys, flowered furniture, 
quilts, samplers, hex signs, house interiors, etc. Full descriptive text. Excellent for 
tourist, rewarding for designer, collector. Map. I46pp. 7% x 10%. 

21205-X Paperbound $2.50 

Early New England Gravestone Rubbings, Edmund 'V. Gillon, Jr. 43 photo¬ 
graphs, 226 carefully reproduced rubbings show heavily symbolic, sometimes 
macabre early gravestones, up to early 19th century. Remarkable early American 
primitive art, occasionally strikingly beautiful; always powerful. Text, xxvi -f- 
207pp. 8% X 11%. 21380-3 Paperbound $3-50 



CATALOGUE OF DOVER BOOKS 


Alphabets and Ornaments, Ernst Uhner. Well-known pictorial source for 
decorative alphabets, script examples, cartouches, frames, decorative title pat;es, calli¬ 
graphic initials, borders, similar material. l-4th to 19th centur>’, mostly European. 
Useful in almost any graphic arts designing, varied styles. 750 illustrations. 256pp. 
7 X 10- 21905-4 Papcrbound $ 1.00 

Painting; A Creative Approach, Norman Colquhoun. For the beginner simple 
guide provides an instructive approach to painting: major stumbling blocks for 
beginner; overcoming them, technical points; paints and pigments: oil painting; 
watercolor and other media and color. New section on "plastic ' paints. Glossar)’. 
Eo:mti:\y Pahn Your Ou ft Pictures. 221pp. 22000-1 Paperbound $1.75 

The Enjoyment and Use of Color. Walter Sargent. Explanation of the rela¬ 
tions between colors themselves and between colors in nature and art. including 
hundreds of little-known facts about color values, intensities, effects of high and 
low illumination, complementary colors. Many practical hints for painters, references 
to great masters. 7 color plates, 29 illustrations, x + 274pp. 

20944-X Paperbound $2.75 

The Notebooks of Leonardo Da Vinci, compiled and edited by Jean Paul 
Richter. 1566 extracts from original manuscripts reveal the full range of Leonardo’s 
versatile genius: all his writings on painting, sculpture, architecture, anatomy, 
astronomy, geography, topography, physiology, mining, music, etc., in both Italian 
and English, with 186 plates of manuscript pages and more than 500 additional 
drawings. Includes studies for the Last Supper, the lost Sforza monument, and 
other works. Total of xlvii + 866pp. 7% x lO?/^, 

22572-0, 22573-9 Two volumes, Paperbound .Sll.OO 


Montgomery Ward Catalogue of 1895. Tea gowns, yards of flannel and 
pillow-case lace, stereoscopes, books of gospel hymns, the New Improved Singer 
Sewing Machine, side saddles, milk skimmers, straight-edged razors, high-button 
shoes, spittoons, and on and on . . . listing some 25,000 items, practically all illus¬ 
trated. Essential to the shoppers of the 1890’s, it is our truest record of the spirit of 
the period. Unaltered reprint of Issue No. 57, Spring and Summer 1895. Introduc¬ 
tion by Boris Emmet. Innumerable illustrations, xiii -f 624pp. SVaxlls/s- 

22377-9 Paperbound $6.95 


Crystal Palace Exhibition Illustrated Catalogue (London, 1851). 
<^e of the wonders of the modern world—the Crystal Palace Exhibition in which 
2 the nations of the civilized world exhibited their achievements in the arts and 
sciences—presented in an equally important illustrated catalogue. More than 1700 
items pictured with accompanying text—ceramics, textiles, cast-iron work, carpets, 
pianos, sleds, razors, wall-papers, billiard tables, beehives. silverw.irc and hundreds 
pother artifacts—represent the focal point of Victorian culture in the Western 
World. Probably the largest collection of Victorian decorative art ever assembled— 
indispensable for antiquarians and designers. Unabridged republication of the 
Art-Journal Catalogue of the Great Exhibition of 1851. with all terminal essays. 
New introduction by John Gloag, F.S.A. xxxiv + 426pp. 9 x 12 . 

22503-8 Paperbound $5.00 


CATALOGUE OF DOVER BOOKS 


A History of Costume, Carl Kohler. Definitive history, based on surviving pieces 
of clothing primarily, and paintings, statues, etc. secondarily. Highly readable text, 
supplemented by 594 illustrations of costumes of the ancient Mediterranean peoples, 
Greece and Rome, the Teutonic prehistoric period; costumes of the Middle Ages, 
Renaissance, Baroque, 18th and 19th centuries. Clear, measured patterns are pro¬ 
vided for many clothing articles. Approach is practical throughout. Enlarged by 
Emma von Sichart. 464pp. 21030-8 Paperbound sS3-50 

Oriental Rugs, Antique and Modern, Walter A. Hawley. A complete and 
authoritative treatise on the Oriental rug—where they are made, by whom and how, 
designs and symbols, characteristics in detail of the six major groups, how to dis¬ 
tinguish them and how' to buy them. Detailed technical data is provided on periods, 
weaves, warps, wefts, textures, sides, ends and knots, although no technical back¬ 
ground is required for an understanding. 11 color plates, 80 halftones, 4 maps, 
vi 320pp. 61/8 X 91 / 8 . 22366-3 Paperbound $5.00 

Ten Books on Architecture, Vitruvius. By any standards the most important 
book on architecture ever written. Early Roman discussion of aesthetics of building, 
construction methods, orders, sites, and every other aspect of architecture has in¬ 
spired, instructed architecture for about 2,000 years. Stands behind Palladio, 
Michelangelo, Bramante, Wren, countless others. Definitive Morris H. Morgan 
translation. 68 illustrations, xii + 331pp. 20645-9 Paperbound $3.00 

The Four Books of Architecture, Andrea Palladio. Translated into every 
major Western European language in the two centuries following its publication in 
1570, this has been one of the most influenti.il books in the history of architecture. 
Complete reprint of the 1738 Isaac Ware edition. New introduction by Adolf 
Placzek, Columbia Univ. 216 plates, xxii + 110pp. of text. 9 V 2 x 12^. 

21308-0 Clothbound $12.50 


Sticks and Stones: A Study of American Architecture and Civilization, 
Lewis Mumford.One of the great classics of American cultural history. American 
architecture from the medieval-inspired earliest forms to the early 20th century; 
evolution of structure and style, and reciprocal influences on environment. 21 photo¬ 
graphic illustrations. 238pp. 20202-X Paperbound $2.00 


The American Builder's Companion, Asher Benjamin. The most widely used 
early 19th century architectural style and source book, for colonial up into Greek 
Revival periods. Extensive development of geometry of carpentering, construction 
of sashes, frames, doors, stairs; plans and elevations of domestic and other buildings. 
Hundreds of thousands of houses were built according to this book, now invaluable 
to historians, architects, restorers, etc. 1827 edition. 59 plates. Il4pp. 7% x 10^- 

22236-5 Paperbound $3-50 


Dutch Houses in the Hudson Valley Before 1776, Helen Wilkinson RO'* 
nolds. The standard survey of the Dutch colonial house and outbuildings, with con¬ 
structional features, decoration, and local history associated with individual home¬ 
steads. Introduction by Franklin D. Roosevelt. Map. 150 illustrations. 469pp- 
65/8 X 934 . 21469-9 Paperbound $5.00 



CATALOGUE OF DOVER BOOKS 


The Architecture of Country Houses, Andrew J. Downing. Together with 
Vaux s Villas and Cottages this is the basic book for Hudson River Gothic architec¬ 
ture of the middle Victorian period. Full, sound discussions of general aspects of 
^ housing, architecture, style, decoration, furnishing, togetlier with scores of detailed 
house plans, illustrations of specific buildings, accompanied by full text. Perhaps 
the most influential single American architectural book. 1850 edition. Introduction 
by J. Stewart Johnson. 321 figures, 3-1 architectural designs, xvi -j- 560pp. 

22003-6 Paperbound S i .00 

Lost Examples of Colonial Architecture, John ^^ead Howells. Full-page 
photographs of buildings that have disappeared or been so altered as to be denatured, 
including many designed by major early American architects. 245 plates, xvii + 
248pp. 7y8Xl03^. 21143-6 Paperbound S3.50 

Domestic Architecture of the American Colonies and of the Early 
Republic, Fiske Kimball. Foremost architect and restorer of Williamsburg and 
Monticello covers nearly 200 homes between 1620-1825. Architectural details, con¬ 
struction, style features, special fixtures, floor plans, etc. Generally considered finest 
work in its area. 219 illustrations of houses, doorways, windows, capital mantels. 

• XX -f 314pp. 1 % X 103 / 4 . 21743-4 Paperbound S4.00 


Early American Rooms: 1650-1858, edited by Russell Hawes Kettell. Tour of 12 
rooms, each representative of a different era in American history and each furnished, 
decorated, designed and occupied in the style of the era. 72 plans and elevations, 
8 'page color section, etc., show fabrics, wall papers, arrangements, etc. Full de¬ 
scriptive text, xvii + 200 pp. of text. %% x IIV 4 . 

21633-0 Paperbound $5.00 

The Fitzwilliam Virginal Book, edited by J. Fuller Maitland and W. B. Squire. 
Full modern printing of famous early 17th-century ms. volume of 300 works by 
Morley, Byrd, Bull, Gibbons, etc. For piano or other modern keyboard instrument; 
easy to read format, xxxvi -f 938pp. x 11 . 

21068-5, 21069-3 Two volumes, Paperbound $10.00 

IT Keyboard Music, Johann Sebastian Bach. Bach Gesellschaft edition. A rich 
flection of Bach's masterpieces for the harpsicliord: the six English Suites, six 
French Suites, the six Partitas (Clavieriibung part 1), the Goldberg Variations 
(Qavierubung part IV), the fifteen Two-Part Inventions and the fifteen Three-Part 
infonias. Clearly reproduced on large sheets with ample margins; eminently play¬ 
able. vi 4- 312pp, 8 V& X 11. 22360-4 Paperbound $5.00 

The Music of Bach: An Introduction, Charles Sanford Terry. A fine,
nontechnical introduction to Bach's music, both instrumental and vocal. Covers organ
music, chamber music, passion music, other types. Analyzes themes, developments,
innovations. x + 114pp. 21075-8 Paperbound $1.50

Beethoven and His Nine Symphonies, Sir George Grove. Noted British
musicologist provides best history, analysis, commentary on symphonies. Very thorough,
rigorously accurate; necessary to both advanced student and amateur music lover.
436 musical passages. vii + 407pp. 20334-4 Paperbound $2.75


CATALOGUE OF DOVER BOOKS 


Johann Sebastian Bach, Philipp Spitta. One of the great classics of musicology, 
this definitive analysis of Bach’s music (and life) has never been surpassed. Lucid, 
nontechnical analyses of hundreds of pieces (30 pages devoted to St. Matthew
Passion, 26 to B Minor Mass). Also includes major analysis of 18th-century music.
450 musical examples. 40-page musical supplement. Total of xx + 1799pp. 

(EUK) 22278-0, 22279-9 Two volumes, Clothbound $17.50 

Mozart and His Piano Concertos, Cuthbert Girdlestone. The only full-length 
study of an important area of Mozart's creativity. Provides detailed analyses of all 
23 concertos, traces inspirational sources. 417 musical examples. Second edition. 
509pp. 21271-8 Paperbound $3.50

The Perfect Wagnerite: A Commentary on the Niblung’s Ring, George 
Bernard Shaw. Brilliant and still relevant criticism in remarkable essays on 
Wagner’s Ring cycle, Shaw’s ideas on political and social ideology behind the 
plots, role of Leitmotifs, vocal requisites, etc. Prefaces, xxi + 136pp. 

(USO) 21707-8 Paperbound $1.75 

Don Giovanni, W. A. Mozart, Complete libretto, modern English translation; 
biographies of composer and librettist; accounts of early performances and critical 
reaction. Lavishly illustrated. All the material you need to understand and 
appreciate this great work. Dover Opera Guide and Libretto Series; translated
and introduced by Ellen Bleiler. 92 illustrations. 209pp.

21134-7 Paperbound $2.00 

Basic Electricity, U. S. Bureau of Naval Personnel. Originally a training course,
best non-technical coverage of basic theory of electricity and its applications.
Fundamental concepts, batteries, circuits, conductors and wiring techniques, AC and DC,
inductance and capacitance, generators, motors, transformers, magnetic amplifiers,
synchros, servomechanisms, etc. Also covers blue-prints, electrical diagrams, etc.
Many questions, with answers. 349 illustrations. x + 448pp. 6 1/2 x 9 1/4.

20973-3 Paperbound $3.50

Reproduction of Sound, Edgar Villchur. Thorough coverage for laymen of 
high fidelity systems, reproducing systems in general, needles, amplifiers, preamps, 
loudspeakers, feedback, explaining physical background. "A rare talent for making
technicalities vividly comprehensible," R. Darrell, High Fidelity. 69 figures, 
iv + 92pp. 21515-6 Paperbound Sl..^5 

Hear Me Talkin’ to Ya: The Story of Jazz as Told by the Men Who 
Made It, Nat Shapiro and Nat Hentoff. Louis Armstrong, Fats Waller, Jo Jones,
Clarence Williams, Billie Holiday, Duke Ellington, Jelly Roll Morton and dozens
of other jazz greats tell how it was in Chicago’s South Side, New Orleans,
depression Harlem and the modern West Coast as jazz was born and grew. xvi + 429pp.

21726-4 Paperbound $3.00


Fables of Aesop, translated by Sir Roger L’Estrange. A reproduction of the very 
rare 1931 Paris edition; a selection of the most interesting fables, together with 50 
imaginative drawings by Alexander Calder. v + 128pp. 6 1/2 x 9 1/4.

21780-9 Paperbound $1.50 





Against the Grain (A Rebours), Joris K. Huysmans. Filled with weird images,
evidences of a bizarre imagination, exotic experiments with hallucinatory drugs,
rich tastes and smells and the diversions of its sybarite hero Duc Jean des Esseintes,
this classic novel pushed 19th-century literary decadence to its limits. Full
unabridged edition. Do not confuse this with abridged editions generally sold.
Introduction by Havelock Ellis. xlix + 206pp. 22190-3 Paperbound $2.50

Variorum Shakespeare: Hamlet. Edited by Horace H. Furness; a landmark 
of American scholarship. Exhaustive footnotes and appendices treat all doubtful 
words and phrases, as well as suggested critical emendations throughout the play’s 
history. First volume contains editor’s own text, collated with all Quartos and 
Folios. Second volume contains full first Quarto, translations of Shakespeare’s 
sources (Belleforest, and Saxo Grammaticus), Der Bestrafte Brudermord, and many
essays on critical and historical points of interest by major authorities of past and
present. Includes details of staging and costuming over the years. By far the
best edition available for serious students of Shakespeare. Total of xx + 905pp.

21004-9, 21005-7, 2 volumes, Paperbound $7.00

A Life of William Shakespeare, Sir Sidney Lee. This is the standard life of 
Shakespeare, summarizing everything known about Shakespeare and his plays. 
Incredibly rich in material, broad in coverage, clear and judicious, it has served 
thousands as the best introduction to Shakespeare. 1931 edition. 9 plates, 
xxix + 792pp. 21967-4 Paperbound $4.50 

Masters of the Drama, John Gassner. Most comprehensive history of the drama 
in print, covering every tradition from Greeks to modern Europe and America, 
including India, Far East, etc. Covers more than 800 dramatists, 2000 plays, with 
biographical material, plot summaries, theatre history, criticism, etc. "Best of its
kind in English," New Republic. 77 illustrations. xxii + 890pp.

20100-7 Clothbound $10.00 

The Evolution of the English Language, George McKnight. The growth 
of English, from the 14th century to the present. Unusual, non-technical account 
presents basic information in very interesting form: sound shifts, change in grammar 
and syntax, vocabulary growth, similar topics. Abundantly illustrated with
quotations. Formerly Modern English in the Making. xii + 590pp.

21932-1 Paperbound $4.00 

An Etymological Dictionary of Modern English, Ernest Weekley. Fullest,
richest work of its sort, by foremost British lexicographer. Detailed word histories,
including many colloquial and archaic words; extensive quotations. Do not
confuse this with the Concise Etymological Dictionary, which is much abridged. Total
of xxvii + 830pp. 6 1/2 x 9 1/4.

21873-2, 21874-0 Two volumes, Paperbound $7.90 

Flatland: A Romance of Many Dimensions, E. A. Abbott. Classic of
science-fiction explores ramifications of life in a two-dimensional world, and what
happens when a three-dimensional being intrudes. Amusing reading, but also
useful as introduction to thought about hyperspace. Introduction by Banesh Hoffmann.
lu illustrations, xx + 103pp. 20001-9 Paperbound $1-^5 




Poems of Anne Bradstreet, edited with an introduction by Robert Hutchinson. 
A new selection of poems by America's first poet and perhaps the first significant 
woman poet in the English language. 48 poems display her development in works 
of considerable variety—love poems, domestic poems, religious meditations, formal 
elegies, "quaternions," etc. Notes, bibliography, viii + 222pp. 

22160-1 Paperbound $2.50

Three Gothic Novels: The Castle of Otranto by Horace Walpole; 
Vathek by William Beckford; The Vampyre by John Polidori, with
Fragment of a Novel by Lord Byron, edited by E. F. Bleiler. The first Gothic
novel, by Walpole; the finest Oriental tale in English, by Beckford; powerful
Romantic supernatural story in versions by Polidori and Byron. All extremely
important in history of literature; all still exciting, packed with supernatural
thrills, ghosts, haunted castles, magic, etc. xl + 291pp.

21232-7 Paperbound $2.50 

The Best Tales of Hoffmann, E. T. A. Hoffmann. 10 of Hoffmann’s most 
important stories, in modern re-editings of standard translations: Nutcracker and 
the King of Mice, Signor Formica, Automata, The Sandman, Rath Krespel, The 
Golden Flowerpot, Master Martin the Cooper, The Mines of Falun, The King's 
Betrothed, A New Year's Eve Adventure. 7 illustrations by Hoffmann. Edited 
by E. F. Bleiler. xxxix + 419pp. 21793-0 Paperbound $3.00

Ghost and Horror Stories of Ambrose Bierce, Ambrose Bierce. 23 strikingly 
modern stories of the horrors latent in the human mind: The Eyes of the Panther, 
The Damned Thing, An Occurrence at Owl Creek Bridge, An Inhabitant of Carcosa, 
etc., plus the dream-essay, Visions of the Night. Edited by E. F. Bleiler. xxii
+ 199pp. 20767-6 Paperbound $1.50

Best Ghost Stories of J. S. LeFanu, J. Sheridan LeFanu. Finest stories by 
Victorian master often considered greatest supernatural writer of all. Carmilla, 
Green Tea, The Haunted Baronet, The Familiar, and 12 others. Most never before 
available in the U. S. A. Edited by E. F. Bleiler. 8 illustrations from Victorian 
publications, xvii + 467pp. 20415-4 Paperbound $3.00 


Mathematical Foundations of Information Theory, A. I. Khinchin.
Comprehensive introduction to work of Shannon, McMillan, Feinstein and Khinchin,
placing these investigations on a rigorous mathematical basis. Covers entropy
concept in probability theory, uniqueness theorem, Shannon’s inequality, ergodic
sources, the E property, martingale concept, noise, Feinstein's fundamental lemma,
Shannon’s first and second theorems. Translated by R. A. Silverman and M. D.
Friedman. iii + 120pp. 60434-9 Paperbound $2.00


Seven Science Fiction Novels, H. G. Wells. The standard collection of the 
great novels. Complete, unabridged. First Men in the Moon, Island of Dr. Moreau, 
War of the Worlds, Food of the Gods, Invisible Man, Time Machine, In the Days 
of the Comet. Not only science fiction fans, but every educated person owes it to 
himself to read these novels. 1015pp. (USO) 20264-X Clothbound $6.00 





Last and First Men and Star Maker: Two Science Fiction Novels, Olaf
Stapledon. Greatest future histories in science fiction. In the first, human
intelligence is the "hero," through strange paths of evolution, interplanetary invasions,
incredible technologies, near extinctions and reemergences. Star Maker describes the
quest of a band of star rovers for intelligence itself, through time and space: weird
inhuman civilizations, crustacean minds, symbiotic worlds, etc. Complete
unabridged. v + 438pp. (USO) 21962-3 Paperbound $2.50

Three Prophetic Novels, H. G. Wells. Stages of a consistently planned future
for mankind. When the Sleeper Wakes, and A Story of the Days to Come, anticipate
Brave New World and 1984, in the 21st Century; The Time Machine, only
complete version in print, shows farther future and the end of mankind. All show
Wells's greatest gifts as storyteller and novelist. Edited by E. F. Bleiler. x

+ (USO) 20605-X Paperbound $2.50 

The Devil's Dictionary, Ambrose Bierce. America's own Oscar Wilde—

iconoclastic wisdom in over 1,000 definitions
hailed by H. L. Mencken as some of the most gorgeous witticisms in the English
language. 145pp. 20487-1 Paperbound $1.25

Max and Moritz, Wilhelm Busch. Great children’s classic, father of comic

Moritz. Also Ker and Plunk (Plisch und Plumm)

Ice-Peter, The Boy and the Pipe, and five other
pieces. Original German, with English translation. Edited by H. Arthur Klein;
translations by various hands and H. Arthur Klein. vi + 216pp.

20181-3 Paperbound $2.00

Pigs is Pigs and Other Favorites, Ellis Parker Butler. The title story is one
of the best humor short stories, as Mike Flannery obfuscates biology and English.
Also included, That Pup of Murchison's, The Great American Pie Company, and
Perkins of Portland. 14 illustrations. v + 109pp. 21532-6 Paperbound $1.25

genius to be as stupidly mad as 
and become wise, celebrate the "Fourth," keep a cow
and otherwise strain the resources of the Lady from Philadelphia. Basic book of
American humor. 153 illustrations. 219pp. 20794-3 Paperbound $2.00

Perrault's Fairy Tales, translated by A. E. Johnson and S. R. Littlewood, with

by Gustave Doré. All the original Perrault stories:

Little Red Riding Hood, Puss in Boots, Tom

magnificent illustrations of

Doré. One of the five or six great books of European fairy tales. viii + 117pp.

22311-6 Paperbound $2.00

Baroness Orczy. Favorites translated and adapted

of Princess

adventure will captivate children

as it has for generations. 90 drawings by Montagu Barstow. 96pp.

(USO) 22293-4 Paperbound $1.95




The Red Fairy Book, Andrew Lang. Lang’s color fairy books have long been 
children’s favorites. This volume includes Rapunzel, Jack and the Bean-stalk and
35 other stories, familiar and unfamiliar. 4 plates, 93 illustrations. x + 367pp.

21673-X Paperbound $2.50 

The Blue Fairy Book, Andrew Lang. Lang’s tales come from all countries and all 
times. Here are 37 tales from Grimm, the Arabian Nights, Greek Mythology, and 
other fascinating sources. 8 plates, 130 illustrations, xi + 390pp. 

21437-0 Paperbound $2.75 

Household Stories by the Brothers Grimm. Classic English-language edition 
of the well-known tales — Rumpelstiltskin, Snow White, Hansel and Gretel, The 
Twelve Brothers, Faithful John, Rapunzel, Tom Thumb (52 stories in all).
Translated into simple, straightforward English by Lucy Crane. Ornamented with
headpieces, vignettes, elaborate decorative initials and a dozen full-page illustrations by
Walter Crane. x + 269pp. 21080-4 Paperbound $2.00

The Merry Adventures of Robin Hood, Howard Pyle. The finest modern
versions of the traditional ballads and tales about the great English outlaw. Howard
Pyle’s complete prose version, with every word, every illustration of the first edition.
Do not confuse this facsimile of the original (1883) with modern editions that 
change text or illustrations. 23 plates plus many page decorations, xxii + 296pp. 

22043-5 Paperbound $2.75 

The Story of King Arthur and His Knights, Howard Pyle. The finest
children’s version of the life of King Arthur; brilliantly retold by Pyle, with 48 of his
most imaginative illustrations. xviii + 313pp. 6 1/8 x 9 1/4.

21445-1 Paperbound $2.50 

The Wonderful Wizard of Oz, L. Frank Baum. America’s finest children's 
book in facsimile of first edition with all Denslow illustrations in full color. The 
edition a child should have. Introduction by Martin Gardner. 23 color plates, 
scores of drawings, iv + 267pp. 20691-2 Paperbound $2.50 

The Marvelous Land of Oz, L. Frank Baum. The second Oz book, every bit as 
imaginative as the Wizard. The hero is a boy named Tip, but the Scarecrow and the
Tin Woodman are back, as is the Oz magic. 16 color plates, 120 drawings by John 
R. Neill. 287pp. 20692-0 Paperbound $2.50 

The Magical Monarch of Mo, L. Frank Baum. Remarkable adventures in a land 
even stranger than Oz. The best of Baum’s books not in the Oz series. 15 color
plates and dozens of drawings by Frank Verbeck. xviii + 237pp. 

21892-9 Paperbound $2.25 

The Bad Child’s Book of Beasts, More Beasts for Worse Children, A 
Moral Alphabet, Hilaire Belloc. Three complete humor classics in one volume. 

Be kind to the frog, and do not call him names . . . and 28 other whimsical animals. 
Familiar favorites and some not so well known. Illustrated by Basil Blackwell. 
156pp. (USO) 20749-8 Paperbound $1.50





East O’ the Sun and West O’ the Moon, George W. Dasent. Considered the
best of all translations of these Norwegian folk tales, this collection has been enjoyed 
by generations of children (and folklorists too). Includes True and Untrue, Why the 
Sea is Salt, East O’ the Sun and West O’ the Moon, Why the Bear is Stumpy-Tailed, 
Boots and the Troll, The Cock and the Hen, Rich Peter the Pedlar, and 52 more. 
The only edition with all 59 tales. 77 illustrations by Erik Werenskiold and Theodor 
Kittelsen. xv + 418pp. 22521-6 Paperbound $5.50

GOOPS AND How TO BE Them, Gelett Burgess. Classic of tongue-in-cheek humor, 
masquerading as etiquette book. 87 verses, twice as many cartoons, show
mischievous Goops as they demonstrate to children virtues of table manners, neatness,
courtesy, etc. Favorite for generations. viii + 88pp. 6 1/2 x 9 1/4.

22233-0 Paperbound $1.50 


Alice’s Adventures Under Ground, Lewis Carroll. The first version, quite 
different from the final Alice in Wonderland, printed out by Carroll himself with 
his own illustrations. Complete facsimile of the "million dollar" manuscript Carroll
gave to Alice Liddell in 1864. Introduction by Martin Gardner, viii + 96pp. Title 
and dedication pages in color. 21482-6 Paperbound $1.25 

The Brownies, Their Book, Palmer Cox. Small as mice, cunning as foxes,
exuberant and full of mischief, the Brownies go to the zoo, toy shop, seashore, circus,
etc., in 24 verse adventures and 266 illustrations. Long a favorite, since their first
appearance in St. Nicholas Magazine. xi + 144pp. 6 5/8 x 9 1/4.

21265-3 Paperbound $1.75 


Songs of Childhood, Walter De La Mare. Published (under the pseudonym
Walter Ramal) when De La Mare was only 29, this charming collection has long
been a favorite children’s book. A facsimile of the first edition in paper, poems
capture the simplicity of the nursery rhyme and the ballad, including such lyrics as
I Met Eve, Tartary, The Silver Penny. vii + 106pp. (USO) 21972-0 OO


The Complete Nonsense of Edward Lear, Edward Lear. The finest 19th-century
humorist-cartoonist in full: all nonsense limericks, zany alphabets, Owl and
Pussycat, songs, nonsense botany, and more than 500 illustrations by Lear himself.
by Holbrook Jackson. xxix + 287pp. (USO) 20167-8 Paperbound $2.00


Billy Whiskers: The Autobiography of a Goat, Frances Trego
A favorite of children since the early 20th century, here are the escapades of
rambunctious, irresistible and mischievous goat—Billy Whiskers.
spirit of Peck’s Bad Boy, this is a book that children never tire of reading or hearing.
All the original familiar illustrations by W. H. Fry are included:

18 black and white drawings. 159pp. 22345-0 Paperbound $2.00

Mother Goose Melodies. Faithful republication of the

and Francis "copyright 1833" Boston edition—the most important

collection, usually referred to as the "original." Familiar rhymes

ones, with wonderful old woodcut illustrations. Edited by

4 1/2 x 6 3/8. 22577-1 Paperbound $1.00




Two Little Savages; Being the Adventures of Two Boys Who Lived as 
Indians and What They Learned, Ernest Thompson Seton. Great classic of
nature and boyhood provides a vast range of woodlore in most palatable form, a 
genuinely entertaining story. Two farm boys build a teepee in woods and live in it 
for a month, working out Indian solutions to living problems, star lore, birds and 
animals, plants, etc. 293 illustrations, vii + 286pp. 

20983-7 Paperbound $2.50 

Peter Piper’s Practical Principles of Plain & Perfect Pronunciation. 
Alliterative jingles and tongue-twisters of surprising charm, that made their first 
appearance in America about 1830. Republished in full with the spirited woodcut
illustrations from this earliest American edition. 32pp. 4*/^ x 6^. 

22560-7 Paperbound $1.00

Science Experiments and Amusements for Children, Charles Vivian. 73 easy 
experiments, requiring only materials found at home or easily available, such as 
candles, coins, steel wool, etc.; illustrate basic phenomena like vacuum, simple 
chemical reaction, etc. All safe. Modern, well-planned. Formerly Science Games 
for Children. 102 photos, numerous drawings. 96pp. 6 1/8 x 9 1/4.

21856-2 Paperbound $1.25 

An Introduction to Chess Moves and Tactics Simply Explained, Leonard 
Barden. Informal intermediate introduction, quite strong in explaining reasons for 
moves. Covers basic material, tactics, important openings, traps, positional play in 
middle game, end game. Attempts to isolate patterns and recurrent configurations. 
Formerly Chess. 58 figures. 102pp. (USO) 21210-6 Paperbound $1.25 


Lasker's Manual of Chess, Dr. Emanuel Lasker. Lasker was not only one of the 
five great World Champions, he was also one of the ablest expositors, theorists, and 
analysts. In many ways, his Manual, permeated with his philosophy of battle, filled 
with keen insights, is one of the greatest works ever written on chess. Filled with 
analyzed games by the great players. A single-volume library that will profit almost
any chess player, beginner or master. 308 diagrams. xli + 349pp.

20640-8 Paperbound $2.75 

The Master Book of Mathematical Recreations, Fred Schuh. In opinion of
many the finest work ever prepared on mathematical puzzles, stunts, recreations;
exhaustively thorough explanations of mathematics involved, analysis of effects,
citation of puzzles and games. Mathematics involved is elementary. Translated by
F. Gobel. 194 figures, xxiv + 430pp. 22134-2 Paperbound $3.50 


Mathematics, Magic and Mystery, Martin Gardner. Puzzle editor for Scientific
American explains mathematics behind various mystifying tricks: card tricks, stage
"mind reading," coin and match tricks, counting out games, geometric dissections,
etc. Probability sets, theory of numbers clearly explained. Also provides more than
400 tricks, guaranteed to work. 135 illustrations. xii + 176pp.

20335-2 Paperbound $1.75 





Mathematical Puzzles for Beginners and Enthusiasts, Geoffrey Mott-Smith.
189 puzzles from easy to difficult—involving arithmetic, logic, algebra, properties
of digits, probability, etc.—for enjoyment and mental stimulus. Explanation of
mathematical principles behind the puzzles. 135 illustrations. viii + 248pp.

20198-8 Paperbound $1.75


Paper Folding for Beginners, William D. Murray and Francis J. Rigney. Easiest
book on the market, clearest instructions on making interesting, beautiful origami.
Sail boats, cups, roosters, frogs that move legs, bonbon boxes, standing birds, etc.
40 projects; more than 275 diagrams and photographs. 94pp.

20713-7 Paperbound $1.00 

Tricks and Games on the Pool Table, Fred Herrmann. 79 tricks and games— 
some solitaires, some for two or more players, some competitive games—to entertain 
you between formal games. Mystifying shots and throws, unusual caroms, tricks 
involving such props as cork, coins, a hat, etc. Formerly Fun on the Pool Table. 
77 figures. 95pp. 21814-7 Paperbound $1.25

Hand Shadows to be Thrown Upon the Wall; A Series of Novel and 
Amusing Figures Formed by the Hand, Henry Bursill. Delightful picturebook 
from great-grandfather’s day shows how to make 18 different hand shadows: a bird 
that flies, duck that quacks, dog that wags his tail, camel, goose, deer, boy, turtle,
etc. Only book of its sort. vi + 33pp. 6 1/2 x 9 1/4. 21779-5 Paperbound $1.00

Whittling and Woodcarving, E. J. Tangerman. 18th printing of best book on
market. "If you can cut a potato you can carve" toys and puzzles, chains, chessmen,
caricatures, masks, frames, woodcut blocks, surface patterns, much more. Information
on tools, woods, techniques. Also goes into serious wood sculpture from Middle
Ages to present, East and West. 464 photos, figures. x + 293pp.

20965-2 Paperbound $2.00

History of Philosophy, Julian Marias. Possibly the clearest, most easily followed, 
best planned, most useful one-volume history of philosophy on the market; neither 
skimpy nor overfull. Full details on system of every major philosopher and dozens 
of less important thinkers from pre-Socratics up to Existentialism and later. Strong
on many European figures usually omitted. Has gone through dozens of editions in
Europe. 1966 edition, translated by Stanley Appelbaum and Clarence Strowbridge.
xviii + 505pp. 21739-6 Paperbound $3.50

Yoga: A Scientific Evaluation, Kovoor T. Behanan. Scientific but non-technical
study of physiological results of yoga exercises; done under auspices of Yale U.
Relations to Indian thought, to psychoanalysis, etc. 16 photos. xxiii + 270pp.

20505-3 Paperbound $2.50


Prices subject to change without notice.

Available at your book dealer or write for free catalogue to Dept. GI, Dover
Publications, Inc., 180 Varick St., N. Y., N. Y. 10014. Dover publishes more than
150 books each year on science, elementary and advanced mathematics,
music, art, literary history, social sciences and other areas.






